I have an employee table in Postgres with a jsonb column "mobile" that stores a JSON array value:
e_id (integer) | name (char) | mobile (jsonb)
1              | John        | [{"mobile": "1234567891", "status": "verified"}, {"mobile": "1265439872", "status": "verified"}]
2              | Ben         | [{"mobile": "6453637238", "status": "verified"}, {"mobile": "4437494900", "status": "verified"}]
I have a search API which queries this table to find an employee by mobile number.
How can I query the mobile numbers directly?
How should I create an index on the jsonb column to make the query faster?
You can query like this:
SELECT e_id, name
FROM employees
WHERE mobile @> '[{"mobile": "1234567891"}]';
The following index would help:
CREATE INDEX ON employees USING gin (mobile);
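If the index only needs to support containment (@>) searches, a GIN index with the jsonb_path_ops operator class is a smaller alternative. A minimal sketch, assuming the same table and column (the index name is just illustrative):

-- Supports only the @> containment operator, but is smaller than the default GIN index
CREATE INDEX employees_mobile_gin_path_idx
    ON employees USING gin (mobile jsonb_path_ops);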
I have a table that I query like this...
select *
from table
where productId = 'abc123'
Which returns 2 rows (even though the productId is unique) because one of the columns (orderName) is an Array...
productId | productName     | created    | featureCount | orderName
abc123    | someProductName | 2020-01-01 | 12           | someOrderName
          |                 |            |              | someOtherOrderName
I'm not sure whether the missing values in the 2nd row are empty strings or nulls, since it's the orderName array that expands my result into extra rows. I now want to run a query like this...
select productName, ARRAY_TO_STRING(orderName,'-')
from table
where productId = 'abc123'
and ifnull(featureCount,0) > 0
But this query returns...
someProductName, someOrderName-someOtherOrderName
i.e. both array values came back even though I specified a condition of featureCount>0.
I'm sure I'm missing something very basic about how arrays work in BigQuery, but from Google's ARRAY_TO_STRING documentation I don't see any way to add a condition to the extraction of ARRAY values. I'd appreciate any thoughts on the best way to go about this.
From what I understand, this happens because you are really querying a single row of data that has a column of type ARRAY<STRING>. Since ARRAY_TO_STRING accepts an ARRAY<STRING> value, all of the array's elements end up concatenated into a single cell.
So when you run your script, the output matches your criteria and returns one row, with the array's elements shown as additional rows purely for visibility.
The visualization in the UI should look like what you mention in your question:
Row | productId | productName     | created    | featureCount | orderName
1   | abc123    | someProductName | 2020-01-01 | 12           | someOrderName
    |           |                 |            |              | someOtherOrderName
Note: In BigQuery this additional row is grayed out; it is part of row 1 but is shown as an extra row for visibility. So this output has only 1 row in the table.
And the visualization as JSON will be:
[
  {
    "productId": "abc123",
    "productName": "someProductName",
    "created": "2020-01-01",
    "featureCount": "12",
    "orderName": [
      "someOrderName",
      "someOtherOrderName"
    ]
  }
]
I don't think there is specific documentation about how arrays are visualized in the UI, but I can share the docs that cover working with arrays and flattening them so that each element becomes its own row (see the sketch after these links):
Working with Arrays
Flattening Arrays
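As a rough illustration of what flattening looks like in practice, here is a sketch that returns one output row per orderName element; the table path and column names are assumptions taken from your question:

-- Each array element becomes its own row once the array is unnested
SELECT productId, productName, featureCount, singleOrderName
FROM `project-id.dataset.your_table`,
     UNNEST(orderName) AS singleOrderName
WHERE productId = 'abc123';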
I used the following to replicate your issue:
CREATE OR REPLACE TABLE `project-id.dataset.working_table` (
  productId STRING,
  productName STRING,
  created STRING,
  featureCount STRING,
  orderName ARRAY<STRING>
);

INSERT INTO `project-id.dataset.working_table` (productId, productName, created, featureCount, orderName)
VALUES ('abc123', 'someProductName', '2020-01-01', '12', ['someOrderName', 'someOtherOrderName']);

INSERT INTO `project-id.dataset.working_table` (productId, productName, created, featureCount, orderName)
VALUES ('abc123X', 'someProductNameX', '2020-01-02', '15', ['someOrderName', 'someOtherOrderName', 'someData']);
Output:
Row | productId | productName      | created    | featureCount | orderName
1   | abc123    | someProductName  | 2020-01-01 | 12           | someOrderName
    |           |                  |            |              | someOtherOrderName
2   | abc123X   | someProductNameX | 2020-01-02 | 15           | someOrderName
    |           |                  |            |              | someOtherOrderName
    |           |                  |            |              | someData
Note: The table contains 2 rows.
Let's say I have a table students with a jsonb column where I store a list of students' additional emails. A student row looks like this:
student_id | name     | emails
1          | John Doe | ["j.doe@email.com"]
I'm using the following query to update the emails column:
UPDATE students SET emails = emails || '["j.doe@email.com"]'::jsonb
WHERE student_id = 1
AND NOT emails @> '["j.doe@email.com"]'::jsonb;
Once the column emails is filled, if I reuse the query above with the parameter ["j.doe@email.com", "john@email.com"], the column emails gets updated with a repeated value:
student_id | name      | emails
1          | Student 1 | ["j.doe@email.com", "j.doe@email.com", "john@email.com"]
Is there a way to make sure that the column emails always holds a jsonb list with only unique values?
Use this handy function which removes duplicates from a jsonb array:
create or replace function jsonb_unique_array(jsonb)
returns jsonb language sql immutable as $$
select jsonb_agg(distinct value)
from jsonb_array_elements($1)
$$;
Your update statement may look like this:
update students
set emails = jsonb_unique_array(emails || '["j.doe@email.com", "john@email.com"]'::jsonb)
where student_id = 1
and not emails @> '["j.doe@email.com", "john@email.com"]'::jsonb
Test it in db<>fiddle.
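As a quick standalone check of what the function does (the literal below is just an illustrative value, not from the question):

-- Duplicates collapse to a single element
SELECT jsonb_unique_array('["a", "a", "b"]'::jsonb);
-- Returns: ["a", "b"]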
The DB has a table named records. Among its fields are product_id, quantity, and store_id.
I need to get the sum total of quantity for the rows that belong to a certain store_id and have the same product_id value.
For example, the table has the values (2,3,4), (2,1,5), (2,2,4) as (product_id, quantity, store_id). Then I need to get the sum total of quantity, along with the other columns, from the 1st and 3rd rows. The 2nd row will also be present in the result.
Let us assume the Controller name to be RecordController and the model name to be Record.
How should I write the query ?
It can be done using a single query. Try this:
DB::table('records')
->select('product_id',DB::raw('sum(quantity) as quantity'),'store_id')
->groupBy('product_id')
->groupBy('store_id')
->get();
Explanation:
This query fetches all the records and groups them by product_id and store_id, so you get only a single row per product within a store, and it returns the sum of quantity for each group.
Update:
Using the ORM:
\App\YourModelName::select('product_id',DB::raw('sum(quantity) as quantity'),'store_id')
->groupBy('product_id')
->groupBy('store_id')
->get();
But you need the DB::raw() expression in either case; the aggregation can't be done with the default ORM methods alone.
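For reference, the SQL that this builder call should generate looks roughly like the following (table and column names taken from the question):

-- Group by product_id and store_id, summing quantity per group
SELECT product_id, SUM(quantity) AS quantity, store_id
FROM records
GROUP BY product_id, store_id;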
How can I search for a string value in a manipulated (derived) column of a table, like:
SELECT SUBSTR(DESCR,1,8) AS MYDES FROM STATION WHERE descr='ABERDEEN';
I want to search for 'ABERDEEN' in SUBSTR(DESCR,1,8), i.e. the 'MYDES' column, but in my case 'ABERDEEN' is still searched in the DESCR column. How can I search in the new manipulated 'MYDES' column (Oracle, SQL)?
If you want to filter the rows whose descr column has ABERDEEN as its leading substring, then use the function in the filter predicate.
For example,
SELECT * FROM STATION WHERE SUBSTR(DESCR,1,8)='ABERDEEN';
To include the new column in the output along with other columns,
SELECT t.*,
       SUBSTR(t.DESCR,1,8) AS MYDES
FROM STATION t
WHERE SUBSTR(t.DESCR,1,8) = 'ABERDEEN';
You can also use LIKE in the WHERE clause, as below:
SELECT SUBSTR(DESCR,1,8) AS MYDES FROM STATION WHERE descr like 'ABERDEEN%';
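If you prefer to filter on the alias itself, one option (a sketch, not part of the original answers) is to wrap the projection in an inline view so that MYDES becomes a real column name:

-- The alias MYDES only exists outside the inner query, so filter in the outer one
SELECT MYDES
FROM (
  SELECT SUBSTR(DESCR, 1, 8) AS MYDES
  FROM STATION
)
WHERE MYDES = 'ABERDEEN';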
I have two Hive tables as shown below, along with their columns
Tbl_Customer: Id, Name
Tbl_Cntct: Id, Phone
One Id can have many phone numbers, so I have a table:
Tbl_All: Id, Name, Phn_List ARRAY
My question is how to load data from Tbl_Customer and Tbl_Cntct into Tbl_All.
I can do it in Pig, but want to do the same in Hive.
Thanks
INSERT OVERWRITE TABLE Tbl_All
SELECT cus.id, cus.name, collect_set(ctc.phone)
FROM Tbl_Customer cus JOIN Tbl_Cntct ctc ON cus.id = ctc.id
GROUP BY cus.id, cus.name;
The collect_set UDAF collects the column values into an array with no duplicates. If you want to keep all the values, including duplicated ones, use the collect_list function instead.
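To illustrate the difference with some made-up sample data (the phone values below are hypothetical, not from the question):

-- Suppose Tbl_Cntct has rows (1, '111'), (1, '111'), (1, '222') for id 1
SELECT id,
       collect_set(phone)  AS phones_dedup,   -- ["111", "222"]
       collect_list(phone) AS phones_all      -- ["111", "111", "222"]
FROM Tbl_Cntct
GROUP BY id;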