N1QL : Find latest status from an array - arrays

I have a document of type 'User' as-
{
"id":"User-1",
"Name": "Kevin",
"Gender":"M",
"Statuses":[
{
"Status":"ONLINE",
"StatusChangedDate":"2017-11-01T17:12:00Z"
},
{
"Status":"OFFLINE",
"StatusChangedDate":"2017-11-02T13:24:00Z"
},
{
"Status":"ONLINE",
"StatusChangedDate":"2017-11-02T14:35:00Z"
},
{
"Status":"OFFLINE",
"StatusChangedDate":"2017-11-02T15:47:00Z"
}.....
],
"type":"User"
}
I need user's information along with his latest status details based on a particular date (or date range).
I am able to achieve this using subquery and Unnest clause.
Select U.Name, U.Gender, S.Status, S.StatusChangedDate
From (Select U1.id, max(U1.StatusChangedDate) as StatusChangedDate
From UserInformation U1
Unnest Statuses S1
Where U1.type = 'User'
And U1.StatusChangedDate between '2017-11-02T08:00:00Z' And '2017-11-02T11:00:00Z'
And U1.Status = 'ONLINE'
Group by U1.id
) A
Join UserInformation U On Keys A.id
Unnest U.Statuses S
Where U.StatusChangedDate = A.StatusChangedDate;
But is there any other way of achieving this (like by using collection operators and array functions)??
If yes, please provide me a query or guide me through it.
Thanks.

MAX, MIN argument allows array. 0th element of array can be field needs aggregate and 1st element is what you want to carry.
Using this techinuqe you can project non aggregtae field for MIN/MAX like below.
SELECT U.Name, U.Gender, S.Status, S.StatusChangedDate
FROM UserInformation U1
UNNEST Statuses S1
WHERE U1.type = 'User'
AND S1.StatusChangedDate BETWEEN '2017-11-02T08:00:00Z' AND '2017-11-02T11:00:00Z'
AND S1.Status = 'ONLINE'
GROUP BY U1.id
LETTING S = MAX([S1.StatusChangedDate,S1])[1];
In 4.6.3+ you can also try this without UNNEST Using subquery expressions https://developer.couchbase.com/documentation/server/current/n1ql/n1ql-language-reference/subqueries.html . Array indexing query will be faster.
CREATE INDEX ix1 ON UserInformation(ARRAY v FOR v IN Statuses WHEN v.Status = 'ONLINE' END) WHERE type = "User";
SELECT U.Name, U.Gender, S.Status, S.StatusChangedDate
FROM UserInformation U1
LET S = (SELECT RAW MAX([S1.StatusChangedDate,S1])[1]
FROM U1.Statuses AS S1
WHERE S1.StatusChangedDate BETWEEN '2017-11-02T08:00:00Z' AND '2017-11-02T11:00:00Z' AND S1.Status = 'ONLINE')[0]
WHERE U1.type = 'User'
AND ANY v IN U1.Statuses SATISFIES
v.StatusChangedDate BETWEEN '2017-11-02T08:00:00Z' AND '2017-11-02T11:00:00Z' AND v.Status = 'ONLINE' END;

Related

EF6 query that behaves strangely using contains

I have an EF6 SQL Server query that behaves strangely when it is supplied with a List<int> of IDs to use. If bookGenieCategory = a value it works. If selectedAges is empty (count = 0) all is well. If the selectedAges contains values that exist in the ProductCategory.CategoryId column, the contains fails and NO rows are returned.
Note: AllocationCandidates is a view, which works properly on its own.
CREATE VIEW dbo.AllocationCandidate
AS
SELECT
p.ProductID, p.SKU as ISBN, p.Name as Title,
pv.MSRP, pv.Price, pv.VariantID, pv.Inventory,
ISNULL(plt.DateLastTouched, GETDATE()) AS DateLastTouched,
JSON_VALUE(p.MiscText, '$.AgeId') AS AgeId,
JSON_VALUE(p.MiscText, '$.AgeName') AS AgeName
FROM
dbo.Product AS p WITH (NOLOCK)
INNER JOIN
dbo.ProductVariant AS pv WITH (NOLOCK) ON pv.ProductID = p.ProductID
LEFT OUTER JOIN
dbo.KBOProductLastTouched AS plt WITH (NOLOCK) ON plt.ProductID = p.ProductID
WHERE
(ISJSON(p.MiscText) = 1)
AND (p.Deleted = 0)
AND (p.Published = 1)
AND (pv.IsDefault = 1)
GO
Do I have a typo here or a misplaced parenthesis in the following query?
var returnList = (from ac in _db.AllocationCandidates
join pc in _db.ProductCategories on ac.ProductID equals pc.ProductID
where (bookGenieCategory == 0
|| bookGenieCategory == pc.CategoryID)
&&
(selectedAges.Count == 0 ||
selectedAges.Contains(pc.CategoryID))
orderby ac.AgeId, ac.DateLastTouched descending
select ac).ToList();
Firstly, I would recommend extracting the conditionals outside of the Linq expression. If you only want to filter data if a value is provided, move the condition check outside of the Linq rather than embedding it inside the condition. This is generally easier to do with the Fluent Linq than Linq QL. You should also aim to leverage navigation properties for relationships between entities. This way an AllocationCandidate should have a collection of ProductCategories:
var query = _db.AllocationCandidates.AsQueryable();
if (bookGenieCategory != 0)
query = query.Where(x => x.ProductCategories.Any(c => c.CategoryID == bookGenieCategory);
The next question is what does the selectedAges contains? There is an Age ID on the AllocationCandidate, but your original query is checking against the ProductCategory.CategoryId??
If the check should be against the AllocationCandidate.AgeId:
if (selectedAges.Any())
query = query.Where(x => selectedAges.Contains(x.AgeID));
If the check is as you wrote it against the ProductCategory.CategoryId:
if (selectedAges.Any())
query = query.Where(x => x.ProductCategories.Any(c => selectedAges.Contains(c.AgeID)));
Then add your order by and get your results:
var results = query.OrderBy(x => x.AgeId)
.ThenByDescending(x => x.DateLastTouched);
.ToList();

String '(LISTAGG result)' is too long and would be truncated - How to fix this?

SELECT DISTINCT B.ID, LISTAGG(D.PROCESS_ID,',')
FROM table1 A
JOIN table2 B
ON A.ST_ID = B.ST_ID
JOIN table2 C
ON A.ST_ID = C.ST_ID
JOIN table4 D
ON A.ST_ID = C.ST_ID
WHERE B.DATE = '2022-02-02'
AND FL_CD NOT IN ('1','2','3','4','5')
GROUP BY 1
When I run the above code, I'm getting this error- String '(LISTAGG result)' is too long and would be truncated. This is because of PROCESS_ID column as it is having huge values. Kindly provide a solution to fix this.
As many answers are calling, there's probably an exploding join that's creating way more data than expected.
In any case, I created an UDTF in JavaScript that can receive a table and extract a limited number of sample elements for each label — which is basically what the question is asking for:
create or replace function limited_array_agg(arr_limit float, g string, s string)
returns table (G string, S array)
language javascript
as $$
{
processRow: function f(row, rowWriter, context){
if(this.counter < row.ARR_LIMIT){
this.arr.push(row.S)
this.group = row.G
this.counter++;
};
}
, initialize: function(argumentInfo, context) {
this.counter = 0;
this.arr = [];
}, finalize: function(rowWriter, context){
rowWriter.writeRow({G:this.group, S: this.arr})
}
}
$$;
You can use it like this, where your query could be a CTE:
select limited_agg.g, limited_agg.s
from snowflake_sample_data.tpch_sf100.part
, table(limited_array_agg(
3::float
, p_mfgr
, p_name) over(partition by p_mfgr)) limited_agg
;
In the meantime: I wish Snowflake had a native array_agg() that limits the number of elements.

Parsing string with multiple delimiters into columns

I want to split strings into columns.
My columns should be:
account_id, resource_type, resource_name
I have a JSON file source that I have been trying to parse via ADF data flow. That hasn't worked for me, hence I flattened the data and brought it into SQL Server (I am open to parsing values via ADF or SQL if anyone can show me how). Please check the JSON file at the bottom.
Use this code to query the data I am working with.
CREATE TABLE test.test2
(
resource_type nvarchar(max) NULL
)
INSERT INTO test.test2 ([resource_type])
VALUES
('account_id:224526257458,resource_type:buckets,resource_name:camp-stage-artifactory'),
('account_id:535533456241,resource_type:buckets,resource_name:tni-prod-diva-backups'),
('account_id:369798452057,resource_type:buckets,resource_name:369798452057-s3-manifests'),
('account_id:460085747812,resource_type:buckets,resource_name:vessel-incident-report-nonprod-accesslogs')
The output that I should be able to query in SQL Server should like this:
account_id
resource_type
resource_name
224526257458
buckets
camp-stage-artifactory
535533456241
buckets
tni-prod-diva-backups
and so forth.
Please help me out and ask for clarification if needed. Thanks in advance.
EDIT:
Source JSON Format:
{
"start_date": "2021-12-01 00:00:00+00:00",
"end_date": "2021-12-31 23:59:59+00:00",
"resource_type": "all",
"records": [
{
"directconnect_connections": [
"account_id:227148359287,resource_type:directconnect_connections,resource_name:'dxcon-fh40evn5'",
"account_id:401311080156,resource_type:directconnect_connections,resource_name:'dxcon-ffxgf6kh'",
"account_id:401311080156,resource_type:directconnect_connections,resource_name:'dxcon-fg5j5v6o'",
"account_id:227148359287,resource_type:directconnect_connections,resource_name:'dxcon-fgvfo1ej'"
]
},
{
"virtual_interfaces": [
"account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-fgvj25vt'",
"account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-fgbw5gs0'",
"account_id:401311080156,resource_type:virtual_interfaces,resource_name:'dxvif-ffnosohr'",
"account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-fg18bdhl'",
"account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-ffmf6h64'",
"account_id:390251991779,resource_type:virtual_interfaces,resource_name:'dxvif-fgkxjhcj'",
"account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-ffp6kl3f'"
]
}
]
}
Since you don't have a valid JSON string and not wanting to get in the business of string manipulation... perhaps this will help.
Select B.*
From test2 A
Cross Apply ( Select account_id = max(case when value like 'account_id:%' then stuff(value,1,11,'') end )
,resource_type = max(case when value like 'resource_type:%' then stuff(value,1,14,'') end )
,resource_name = max(case when value like 'resource_name:%' then stuff(value,1,14,'') end )
from string_split(resource_type,',')
)B
Results
account_id resource_type resource_name
224526257458 buckets camp-stage-artifactory
535533456241 buckets tni-prod-diva-backups
369798452057 buckets 369798452057-s3-manifests
460085747812 buckets vessel-incident-report-nonprod-accesslogs
Unfortunately, the values inside the arrays are not valid JSON. You can patch them up by adding {} to the beginning/end, and adding " on either side of : and ,.
DECLARE #json nvarchar(max) = N'{
"start_date": "2021-12-01 00:00:00+00:00",
"end_date": "2021-12-31 23:59:59+00:00",
"resource_type": "all",
"records": [
{
"directconnect_connections": [
"account_id:227148359287,resource_type:directconnect_connections,resource_name:''dxcon-fh40evn5''",
"account_id:401311080156,resource_type:directconnect_connections,resource_name:''dxcon-ffxgf6kh''",
"account_id:401311080156,resource_type:directconnect_connections,resource_name:''dxcon-fg5j5v6o''",
"account_id:227148359287,resource_type:directconnect_connections,resource_name:''dxcon-fgvfo1ej''"
]
},
{
"virtual_interfaces": [
"account_id:227148359287,resource_type:virtual_interfaces,resource_name:''dxvif-fgvj25vt''",
"account_id:227148359287,resource_type:virtual_interfaces,resource_name:''dxvif-fgbw5gs0''",
"account_id:401311080156,resource_type:virtual_interfaces,resource_name:''dxvif-ffnosohr''",
"account_id:227148359287,resource_type:virtual_interfaces,resource_name:''dxvif-fg18bdhl''",
"account_id:227148359287,resource_type:virtual_interfaces,resource_name:''dxvif-ffmf6h64''",
"account_id:390251991779,resource_type:virtual_interfaces,resource_name:''dxvif-fgkxjhcj''",
"account_id:227148359287,resource_type:virtual_interfaces,resource_name:''dxvif-ffp6kl3f''"
]
}
]
}';
SELECT
j4.account_id,
j4.resource_type,
TRIM('''' FROM j4.resource_name) resource_name
FROM OPENJSON(#json, '$.records') j1
CROSS APPLY OPENJSON(j1.value) j2
CROSS APPLY OPENJSON(j2.value) j3
CROSS APPLY OPENJSON('{"' + REPLACE(REPLACE(j3.value, ':', '":"'), ',', '","') + '"}')
WITH (
account_id bigint,
resource_type varchar(20),
resource_name varchar(100)
) j4;
db<>fiddle
The first three calls to OPENJSON have no schema, so the resultset is three columns: key value and type. In the case of arrays (j1 and j3), key is the index into the array. In the case of single objects (j2), key is each property name.

Get data from a table after joining based on null value of joined table using LINQ

I have some tables like "Job" "AppliedJob" "JobOffer" "Contract" "Employeer"
Description: When employeer post job it is stored in "Job" table with his ID. If a freelancer apply to the job it is stored in "AppliedJob" table with his ID. Then Employeer sees the application and send offer to the freelancer and it is stored in "JobOffer" table. If freelancer accept the offer it is then stored in "Contract" table. At first in "Contract" ContractID, OfferID and StratDate are stored and CompletedDate is stored as null. When the contract is completed the CompletedDate field is modified with date.
Want: I want to return all jobs with no completed contract
I tried:
[HttpGet]
[Route("api/PrivateApi/GetEmployeerPostedJob/")]
public object GetEmployeerPostedJob(int id)
{
var data = (from j in db.Jobs
where j.EmployeerID == id
join apl in db.AppliedJobs
on j.JobID equals apl.JobID
join o in db.JobOffers
on apl.AppliedJobID equals o.AppliedJobID
join con in db.Contracts
on o.OfferID equals con.OfferID
where con.CompletedDate == null
select new
{
j.JobTitle,
j.JobID,
j.Budget,
j.Deadline,
j.Employeer,
j.JobDetails,
j.PublishDate,
j.ReqSkill,
j.NoOFFreelancer,
j.Preference,
Category1=j.Category,
totalAppliedFreelancer=(from aple in db.AppliedJobs where j.JobID ==aple.JobID select aple).Count(),
Category = (from gg in db.Categories where gg.CategoryID == j.Category select gg.CategoryName).FirstOrDefault()
}).ToList();
return data.AsEnumerable();
}
But it returns no jobs.
How can i get all jobs which are not completed yet(CompletedDate == null in Contract table)?
I want to show the employeer's posted jobs on his page but only those jobs that are not completed
The above (die to one to many relationships involved) can be paraphrased as
return all jobs with no completed contract
while the way you wrote the query, besides the possible data duplication, it answers the question
return all jobs with existing, but not completed contract
i.e. is missing the jobs w/o applied job, applied job w/o offer and offer w/o contract.
The correct query would be something like this:
from job in db.Jobs
where job.EmployeerID == id
join jobContract in (
from appliedJob in db.AppliedJobs
join offer in db.JobOffers on appliedJob.AppliedJobID equals offer.AppliedJobID
join contract in db.Contracts on offer.OfferID equals contract.OfferID
select new { appliedJob, offer, contract }
) on job.JobID equals jobContract.appliedJob.JobID into jobContracts
where !jobContracts.Any(jobContract => jobContract.contract.CompletedDate != null)
select ...
The query could further be simplified by using navigation properties, but since I don't see navigation properties between AppliedJob and JobOffer, I'm leaving that for you.
Update: Here is the same query with navigation properties (by simplified I meant no need for join operators):
from job in db.Jobs
let completedContracts =
from appliedJob in job.AppliedJobs
from offer in appliedJob.JobOffers
from contract in offer.Contracts
where contract.CompletedDate != null
select contract
where !completedContracts.Any()
select ...
This LINQ query doesn't work on live server (sql-2008) But works on my local VS 2013. Is there any mistake if my live database is empty at first time?
#Ivan Stoev
public object BrowseJobs()
{
var skills = db.Skills.ToDictionary(d => d.SkillID, n => n.SkillName);
var jobData = (from j in db.Jobs where j.Preference==2
//from cj in j.ClosedJobs.DefaultIfEmpty()
join cj in db.ClosedJobs.DefaultIfEmpty()
on j.JobID equals cj.JobID into closedJob
where !closedJob.Any()
join c in db.Categories on j.Category equals c.CategoryID
join jobContract in
(
from appliedJob in db.AppliedJobs.DefaultIfEmpty()
from offer in appliedJob.JobOffers.DefaultIfEmpty()
from contract in db.Contracts.DefaultIfEmpty()
select new { appliedJob, offer, contract }
).DefaultIfEmpty()
on j.JobID equals jobContract.appliedJob.JobID into jobContracts
where !jobContracts.Any(jobContract => jobContract.contract.CompletedDate != null)
select new
{
JobTitle = j.JobTitle,
JobID = j.JobID,
ReqSkillCommaSeperated = j.ReqSkill,
Category = c.CategoryName,
Budget=j.Budget,
Deadline=j.Deadline,
JobDetails=j.JobDetails,
PublishDate=j.PublishDate,
TotalApplied=(from ap in db.AppliedJobs where j.JobID == ap.JobID select ap.AppliedJobID).DefaultIfEmpty().Count()
}).AsEnumerable()
.Select(x => new
{
JobID = x.JobID,
JobTitle = x.JobTitle,
Category = x.Category,
Budget = x.Budget,
Deadline = x.Deadline,
JobDetails = x.JobDetails,
PublishDate = x.PublishDate,
SkillNames = GetSkillName(x.ReqSkillCommaSeperated, skills),
TotalApplied = (from ap in db.AppliedJobs where x.JobID == ap.JobID select ap.AppliedJobID).DefaultIfEmpty().Count()
}).ToList();
return jobData.AsEnumerable();
}

SQL Query Help - searching on multiple 'pairs' or data

I'm struggling to work out how to do a SQL query on a database that I have.
I have a view (which can be changed) which shows the relationships between the tables.
This creates a view as follows:
What I need to be able to do is search on one or more 'Attribute Pairs'
for example
I want to search for records with:
(
(AttributeName='FileExtension' AND AttributeValue='.pdf')
AND (AttributeName='AccountNumber' AND AttributeValue='ABB001'
)
As you can tell, this is not working as AttributeName cant be two things at once. I have this working with an OR filter, but I want it to find records that have all attribute pairs
SELECT
dbo.SiconDMSDocument.SiconDMSDocumentID,
dbo.SiconDMSAttribute.SiconDMSAttributeID,
dbo.SiconDMSAttribute.AttributeFriendlyName,
dbo.SiconDMSAttribute.AttributeName,
dbo.SiconDMSDocumentAttribute.AttributeValue,
dbo.SiconDMSAttribute.DataType,
dbo.SiconDMSDocumentType.SiconDMSDocumentTypeID,
dbo.SiconDMSDocumentType.DocumentTypeName,
dbo.SiconDMSDocumentType.DocumentTypeFriendlyName,
dbo.SiconDMSModule.SiconDMSModuleID,
dbo.SiconDMSModule.ModuleName,
dbo.SiconDMSModule.ModuleFriendlyName,
dbo.SiconDMSDocument.SiconDMSDocumentTypeModuleID
FROM dbo.SiconDMSDocument
INNER JOIN dbo.SiconDMSDocumentAttribute ON dbo.SiconDMSDocument.SiconDMSDocumentID = dbo.SiconDMSDocumentAttribute.SiconDMSDocumentID
INNER JOIN dbo.SiconDMSAttribute ON dbo.SiconDMSDocumentAttribute.SiconDMSAttributeID = dbo.SiconDMSAttribute.SiconDMSAttributeID
AND
(
(dbo.SiconDMSAttribute.AttributeName = 'Reference' AND dbo.SiconDMSDocumentAttribute.AttributeValue='12345')
OR (dbo.SiconDMSAttribute.AttributeName = 'AccountNumber' AND dbo.SiconDMSDocumentAttribute.AttributeValue='ABB001')
)
INNER JOIN dbo.SiconDMSDocumentTypeModule ON dbo.SiconDMSDocument.SiconDMSDocumentTypeModuleID = dbo.SiconDMSDocumentTypeModule.SiconDMSDocumentTypeModuleID
INNER JOIN dbo.SiconDMSDocumentType ON dbo.SiconDMSDocumentTypeModule.SiconDMSDocumentTypeID = dbo.SiconDMSDocumentType.SiconDMSDocumentTypeID
INNER JOIN dbo.SiconDMSModule ON dbo.SiconDMSDocumentTypeModule.SiconDMSModuleID = dbo.SiconDMSModule.SiconDMSModuleID
WHERE
(dbo.SiconDMSDocument.Deleted = 0)
AND (dbo.SiconDMSDocumentAttribute.Deleted = 0)
AND (dbo.SiconDMSAttribute.Deleted = 0)
AND (dbo.SiconDMSDocumentType.Deleted = 0)
AND (dbo.SiconDMSDocumentTypeModule.Deleted = 0)
AND (dbo.SiconDMSModule.Deleted = 0)
Are there any SQL functions that will allow me to do something like this?
I'm not sure what your complicated query has to do with the question of searching for attribute pairs.
Assuming you want the document ids that have both attributes:
select SiconDMSDocumentID
from yourview y
where (AttributeName = 'FileExtension' AND AttributeValue = '.pdf') or
(AttributeName = 'AccountNumber' AND AttributeValue = 'ABB001'
group by SiconDMSDocumentID
having count(*) = 2;
Or, if the attributes could have multiple values:
having count(distinct AttributeName) = 2

Resources