PostgreSQL - Check column value and update after removing any symbols

I have a large amount of data in a table 'Users'. The 'username' field contains string values, but users have entered symbols in them, e.g. im-a-user, you_are_user, etc.
How can I clean that column's data using an SQL query?
I want to clean the values in the username column so that they look like imauser, imanotheruser, andmoreuser, etc.

Use regexp_replace() to strip every character that is not a letter:
select id,
       regexp_replace(lower(username), '[^a-z]', '', 'gi') as clean_user_name
from users;
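If you want to sanity-check the pattern outside the database first, here is a minimal Python sketch of the same cleaning rule (lowercase, then drop everything outside a-z). The function name clean_username and the sample inputs are made up for illustration and are not part of the answer above.

```python
import re


def clean_username(username: str) -> str:
    """Lowercase the name, then remove every character that is not a-z."""
    return re.sub(r"[^a-z]", "", username.lower())


# Hypothetical sample values, only to show the effect of the pattern.
for raw in ["im-a-user", "you_are_user", "And.More-User1"]:
    print(raw, "->", clean_username(raw))
```

Note that digits are removed as well, exactly as with the SQL pattern; widen the character class to [^a-z0-9] if they should be kept.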

Related

How to filter "show tables"

I would like to filter the output of show tables.
The documentation has one example on how to do this using result_scan(last_query_id()), but for me the example does not work:
show tables;
select "schema_name", "name" as "table_name", "rows"
from table(result_scan(last_query_id()))
where "rows" = 0;
-- SQL compilation error: error line 1 at position 8 invalid identifier 'SCHEMA_NAME'
The column SCHEMA_NAME is actually in the output of show tables,
so I do not understand what is wrong.
Best,
Davide
Run the following on your account and see what it is set to:
show parameters like 'QUOTED_IDENTIFIERS_IGNORE_CASE';
If this is set to TRUE, the quotes in your query are ignored, so the column names are uppercased and no longer match the lowercase names in the SHOW output.
To resolve for your own session, you can run the following:
ALTER SESSION SET QUOTED_IDENTIFIERS_IGNORE_CASE = False;
You can also change this at a user or account level, if you wish. Setting this value to TRUE isn't recommended for the reason that you are running into.
You can reference the filter column using $<col_n> syntax ($8 for rows).
Example:
show tables;
select *
from table(result_scan(last_query_id()))
where $8 > 5;
That being said, your query worked for me.

BigQuery or SQL Server SPLIT query

I have searched around and cannot find much on this topic. I have a table that receives logging information. As a result, the column I am interested in contains multiple values that I need to search against. The column is formatted like a URL query string, e.g.
/test/test.aspx?DS_Vendor=55039&DS_ProdVer=7.90.100.0&DS_ProdLang=EN&DS_Product=MTT&DS_OfficeBits=32
This means every search ends up needing really long regexes to get the data, followed by join statements to combine it.
Is there a way in BigQuery, or SQL Server that I can pull the information from that column and put it into new columns?
Example:
The information I would like extracted begins after the ? and ends at &. The string can sometimes be longer, and contains additional headers.
Thanks,
The following is for BigQuery Standard SQL and addresses this part of your question:
Is there a way in BigQuery, ... that I can pull the information from that column and put it into new columns?
#standardSQL
CREATE TEMP FUNCTION parseColumn(kv STRING, column_name STRING) AS (
IF(SPLIT(kv, '=')[OFFSET(0)]= column_name, SPLIT(kv, '=')[OFFSET(1)], NULL)
);
WITH `project.dataset.table` AS (
SELECT '/test/test.aspx?extra=abc&DS_Vendor=55039&DS_ProdVer=7.90.100.0&DS_ProdLang=EN&DS_Product=MTT&DS_OfficeBits=32' AS url UNION ALL
SELECT '/test/test.aspx?DS_Vendor=55192&DS_ProdVer=4.30.100.0&more=123&DS_ProdLang=DE&DS_Product=MTE&DS_OfficeBits=64'
)
SELECT
MIN(parseColumn(kv, 'DS_Vendor')) AS DS_Vendor,
MIN(parseColumn(kv, 'DS_ProdVer')) AS DS_ProdVer,
MIN(parseColumn(kv, 'DS_ProdLang')) AS DS_ProdLang,
MIN(parseColumn(kv, 'DS_Product')) AS DS_Product,
MIN(parseColumn(kv, 'DS_OfficeBits')) AS DS_OfficeBits
FROM `project.dataset.table`,
UNNEST(REGEXP_EXTRACT_ALL(url, r'[?&]([^?&]+)')) AS kv
GROUP BY url
with the result as below:
Row  DS_Vendor  DS_ProdVer  DS_ProdLang  DS_Product  DS_OfficeBits
1    55039      7.90.100.0  EN           MTT         32
2    55192      4.30.100.0  DE           MTE         64
This also covers the following, because keys that are not listed (extra and more in the sample rows) are simply ignored:
The string can sometimes be longer, and contains additional headers.
One example using BigQuery (with standard SQL):
SELECT REGEXP_EXTRACT_ALL(url, r'[?&]([^?&]+)')
FROM (
SELECT '/test/test.aspx?DS_Vendor=55039&DS_ProdVer=7.90.100.0&DS_ProdLang=EN&DS_Product=MTT&DS_OfficeBits=32' AS url
)
This returns the parts of the URL as an ARRAY<STRING>. To go one step further, you can get back an ARRAY<STRUCT<key STRING, value STRING>> with a query of this form:
SELECT
ARRAY(
SELECT AS STRUCT
SPLIT(part, '=')[OFFSET(0)] AS key,
SPLIT(part, '=')[OFFSET(1)] AS value
FROM UNNEST(REGEXP_EXTRACT_ALL(url, r'[?&]([^?&]+)')) AS part
) AS keys_and_values
FROM (
SELECT '/test/test.aspx?DS_Vendor=55039&DS_ProdVer=7.90.100.0&DS_ProdLang=EN&DS_Product=MTT&DS_OfficeBits=32' AS url
)
...or with the keys and values as top-level columns:
SELECT
SPLIT(part, '=')[OFFSET(0)] AS key,
SPLIT(part, '=')[OFFSET(1)] AS value
FROM (
SELECT '/test/test.aspx?DS_Vendor=55039&DS_ProdVer=7.90.100.0&DS_ProdLang=EN&DS_Product=MTT&DS_OfficeBits=32' AS url
)
CROSS JOIN UNNEST(REGEXP_EXTRACT_ALL(url, r'[?&]([^?&]+)')) AS part
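If the logs can also be processed outside BigQuery, the same key/value split can be sketched with Python's standard urllib.parse module. The helper name query_columns is hypothetical, and this is only an illustration of the idea, not a replacement for the SQL above.

```python
from typing import Dict, List, Optional
from urllib.parse import parse_qs, urlsplit


def query_columns(url: str, wanted: List[str]) -> Dict[str, Optional[str]]:
    """Return the first value of each wanted key, or None if the key is absent."""
    # parse_qs maps every key in the query string to a list of its values.
    params = parse_qs(urlsplit(url).query)
    return {name: params.get(name, [None])[0] for name in wanted}


# Sample URL taken from the answer above; extra keys are simply ignored.
url = ("/test/test.aspx?extra=abc&DS_Vendor=55039&DS_ProdVer=7.90.100.0"
       "&DS_ProdLang=EN&DS_Product=MTT&DS_OfficeBits=32")
columns = ["DS_Vendor", "DS_ProdVer", "DS_ProdLang", "DS_Product", "DS_OfficeBits"]
print(query_columns(url, columns))
```

As in the SQL version, extra keys do not affect the requested columns, and a key missing from the URL comes back as None rather than raising an error.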

Oracle ROWTOCOL Function oddities

I have a requirement to pull data in a specific format and I'm struggling slightly with the ROWTOCOL function and was hoping a fresh pair of eyes might be able to help.
I'm using a 10g Oracle DB (10.2), so LISTAGG, which appears to do what I need, is not an option.
I need to aggregate a number of usernames into a string delimited with a '$', but I also need to concatenate another column to build up email addresses.
select
rowtocol('select username_id from username where user_id = '||s.user_id|| 'order by USERNAME_ID asc','#'||d.domain_name||'$')
from username s, domain d
where s.user_id = d.user_id
(I've simplified the query specific to just this function as the actual query is quite large and all works except for this particular function.)
In the DOMAIN table I have a number of domains such as 'hotmail.com', 'gmail.com', etc.
I need to concatenate the username, a '#' symbol and the domain, with the results delimited by a '$', such as:
joe.bloggs#gmail.com$joeblogs#gmail.com$joe_bloggs#gmail.com
I've battled with this and I've got close, but in reverse:
gmail.com$joe.bloggs#gmail.com$joeblogs#gmail.com$joe_bloggs
I've also noticed that if I play around with the delimiter (,'#'||d.domain_name||'$') it has a tendency to drop the first character; as can be seen above, the preceding '#' has been dropped from the first email address.
Can anyone offer any suggestions as to how to get this working?
Many Thanks in advance!
Assuming you're using the rowtocol function from OTN, and have tables something like:
create table username (user_id number, username_id varchar2(20));
create table domain (user_id number, domain_name varchar2(20));
insert into username values (1, 'joe.bloggs');
insert into username values (1, 'joebloggs');
insert into username values (1, 'joe_bloggs');
insert into domain values (1, 'gmail.com');
Then your original query gets three rows back:
gmail.com$joe.bloggs
gmail.com$joe_bloggs#gmail.com$joebloggs
gmail.com$joe_bloggs#gmail.com$joebloggs
You're passing the data from each of your user IDs to a separate call to rowtocol, which isn't really what you want. You can get the result I think you're after by reversing it; pass the main query that joins the two tables as the select argument to the function, and have that passed query do the username/domain concatenation - that is a separate step to the string aggregation:
select
rowtocol('select s.username_id || ''#'' || d.domain_name from username s join domain d on d.user_id = s.user_id', '$')
from dual;
which gets a single result:
joe.bloggs#gmail.com$joe_bloggs#gmail.com$joebloggs#gmail.com
Whether that fits into your larger query, which you haven't shown, is a separate question. You might need to correlate it with the rest of your query.
There are other ways to do string aggregation in Oracle, but this function is one way, and you already have it installed. I'd look at alternatives though, such as ThomasG's answer, which make it a bit clearer what's going on I think.
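To see the two separate steps (per-row concatenation, then aggregation with a delimiter) in something runnable without an Oracle instance, here is a small sketch using Python's built-in sqlite3 module. Note the assumptions: SQLite's group_concat stands in for rowtocol, the tables mirror the sample data above, and the helper name aggregated_emails is made up. The order of the aggregated values is not guaranteed here.

```python
import sqlite3

# In-memory stand-in for the two tables from the sample data above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table username (user_id integer, username_id text);
    create table domain (user_id integer, domain_name text);
    insert into username values (1, 'joe.bloggs');
    insert into username values (1, 'joebloggs');
    insert into username values (1, 'joe_bloggs');
    insert into domain values (1, 'gmail.com');
""")


def aggregated_emails(connection: sqlite3.Connection) -> str:
    """Build username#domain per row, then join all rows with '$'."""
    row = connection.execute("""
        select group_concat(s.username_id || '#' || d.domain_name, '$')
        from username s
        join domain d on d.user_id = s.user_id
    """).fetchone()
    return row[0]


print(aggregated_emails(conn))
```

The point is only that the '#' concatenation happens inside the joined query, while the '$' is purely the aggregation delimiter, which is why nothing ends up reversed or truncated.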
As Alex told you in comments, this ROWTOCOL isn't a standard function so if you don't show its code, there's nothing we can do to fix it.
However you can accomplish what you want in Oracle 10 using the XMLAGG built-in function.
Try this:
SELECT
rtrim (xmlagg (xmlelement (e, s.username_id || '#' || d.domain_name || '$')).extract ('//text()'), '$') whatever
FROM username s
INNER JOIN domain d ON s.user_id = d.user_id

SQLITE3, copy data to a new DB file with a different TABLE Schema

I have two SQLite3 database files with the following result tables:
DB1.db:
result ("ID","Name")
DB2.db:
result ("ID","City","Town","Name")
How can I copy data from the DB1 result table into the DB2 result table with the fixed values City=city1, Town=town1?
Any solution with SQL commands, or a script solution in any language, is more than welcome.
To get a result in the desired form, you can use fixed values with SELECT:
SELECT ID, 'city1', 'town1', Name FROM result;
To copy between two databases, you can ATTACH one to the other:
ATTACH 'DB2.db' AS db2;
... and then copy between the tables:
INSERT INTO db2.result(ID, City, Town, Name)
SELECT ID, 'city1', 'town1', Name
FROM main.result;
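Since a script solution was also welcome, the ATTACH approach can be driven from Python's standard sqlite3 module. This is a minimal sketch: the function names, the temporary demo files and the sample rows are assumptions for illustration only.

```python
import os
import sqlite3
import tempfile


def make_demo_dbs(folder):
    """Create two small demo files with the schemas from the question."""
    db1 = os.path.join(folder, "DB1.db")
    db2 = os.path.join(folder, "DB2.db")
    conn = sqlite3.connect(db1)
    conn.execute('CREATE TABLE result ("ID" INTEGER, "Name" TEXT)')
    conn.executemany("INSERT INTO result VALUES (?, ?)", [(1, "alice"), (2, "bob")])
    conn.commit()
    conn.close()
    conn = sqlite3.connect(db2)
    conn.execute('CREATE TABLE result ("ID" INTEGER, "City" TEXT, "Town" TEXT, "Name" TEXT)')
    conn.commit()
    conn.close()
    return db1, db2


def copy_results(db1_path, db2_path, city="city1", town="town1"):
    """Open DB1, attach DB2, and copy rows with fixed City/Town values."""
    conn = sqlite3.connect(db1_path)
    try:
        conn.execute("ATTACH DATABASE ? AS db2", (db2_path,))
        conn.execute(
            "INSERT INTO db2.result (ID, City, Town, Name) "
            "SELECT ID, ?, ?, Name FROM main.result",
            (city, town),
        )
        conn.commit()
        conn.execute("DETACH DATABASE db2")
    finally:
        conn.close()


def read_rows(db_path):
    """Return all rows of the result table, ordered by ID."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT * FROM result ORDER BY ID").fetchall()
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as folder:
    db1, db2 = make_demo_dbs(folder)
    copy_results(db1, db2)
    print(read_rows(db2))
```

For real files, only copy_results is needed; pass it the paths to your existing DB1.db and DB2.db.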

How to populate select box with the session values in CakePHP

I have stored many employee objects in the session. How do I populate a select box with the names of all the employees? I am very new to CakePHP, so please explain the syntax.
Assuming that the employee objects in your session are in the format array('Employee.id' => 'Employee.name'), you can populate the select box with the following statement in your view: $this->Form->select('employees', $employees) (where $employees is an array formatted as above).
See also http://book.cakephp.org/view/1430/select for more info on the syntax of the select box.
