Use peewee to flatten rows into columns - peewee

The database backend is PostgreSQL on GCP.
I have a group of rows in a table that share a parent id. I am trying to flatten them into a single row with multiple columns.
CREATE TABLE public.lines
(
    line_no int NOT NULL,
    line_content character varying(60) COLLATE pg_catalog."default" NOT NULL,
    parent_id integer NOT NULL
);
with data
(1,'content 1',parent1)
(2,'content 2',parent1)
(3,'content 3',parent1)
...
I am trying to figure out a query that flattens the result into columns, something like:
select line1, line2, line3
where parent = 'parent1'
How can I accomplish this? Thanks!

This doesn't work. SQL databases are tabular. You can't just arbitrarily return rows that are X columns wide.
Your best bet is to use something like Postgres' array_agg, which returns an array. Absent that, you could use string_agg (the Postgres counterpart of group_concat) to produce a comma-separated list.
But this smells and is probably a horrible idea.
Just do the collapse/flatten in code if you must.
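As a rough illustration of doing the flatten in code, here is a minimal Python sketch. It assumes the rows have already been fetched as (line_no, line_content, parent_id) tuples; the sample data and the flatten_by_parent helper are hypothetical, not part of peewee.

```python
from collections import defaultdict

# Hypothetical rows as (line_no, line_content, parent_id), mirroring the lines table.
rows = [
    (2, "content 2", "parent1"),
    (1, "content 1", "parent1"),
    (3, "content 3", "parent1"),
    (1, "other 1", "parent2"),
]


def flatten_by_parent(rows):
    """Group line_content values by parent_id, ordered by line_no."""
    grouped = defaultdict(list)
    # Sorting the tuples orders them by line_no first.
    for line_no, content, parent_id in sorted(rows):
        grouped[parent_id].append(content)
    return dict(grouped)


flat = flatten_by_parent(rows)
print(flat["parent1"])
```

Each parent ends up with a list whose length matches its number of lines, so nothing has to be known about the column count in advance.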

If you are on PostgreSQL, you're really looking for something like a pivot table. I'd look at the crosstab function (from the tablefunc extension), which handles input of the form:
name value
A 1
A 2
B 3
B 4
B 5
and produces
A 1 2 NULL
B 3 4 5
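When the number of output columns is known in advance, a similar result can be had without crosstab by using conditional aggregation (a different technique from crosstab, named here plainly). The sketch below runs it against an in-memory SQLite database as a stand-in for PostgreSQL; the sample rows and the integer parent ids are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lines ("
    "line_no INTEGER NOT NULL, line_content TEXT NOT NULL, parent_id INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO lines VALUES (?, ?, ?)",
    [(1, "content 1", 1), (2, "content 2", 1), (3, "content 3", 1), (1, "other 1", 2)],
)

# One CASE expression per output column; MAX picks the single non-NULL value.
pivot_sql = """
SELECT parent_id,
       MAX(CASE WHEN line_no = 1 THEN line_content END) AS line1,
       MAX(CASE WHEN line_no = 2 THEN line_content END) AS line2,
       MAX(CASE WHEN line_no = 3 THEN line_content END) AS line3
FROM lines
GROUP BY parent_id
ORDER BY parent_id
"""
pivoted = conn.execute(pivot_sql).fetchall()
for row in pivoted:
    print(row)
```

Parents with fewer lines than columns get NULL (None in Python) in the missing positions.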

Related

Filtering SQL rows based on certain alphabets combination

I have a column that stores free-text user input from a frontend website. Users can enter any kind of text, but they also include a specific letter combination to represent a job type, for example 'dri'. As an example:
Row 1: P49384; Open vehicle bonnet-BO-dri 22/10
Row 2: P93818; Vehicle exhaust-BO 10/20
Row 3: P1933; battery dri-pu-103/2
Row 4: P3193; screwdriver-pu 423
Row 5: X939; seats bo
Row 6: P9381-vehicle-pu-bo dri
In this case, I would like to filter only the rows that contain dri. As the example shows, the text can be in any order (users key in whatever they like without following any format). The one constant is that for a particular job type they will include dri.
I know that I can simply use LIKE in SQL Server to get these rows. Unfortunately, row 4 is also returned when I use this operator, because screwdriver contains dri.
Is there any way in SQL Server to strictly obtain only the rows that have the dri job type, while excluding words like screwdriver?
I tried PATINDEX, but it failed too: PATINDEX('%[d][r][i]%', column) > 0
Thanks in advance.
Your data is the problem here. Unfortunately even for denormalised data it doesn't appear to have a reliable/defined format, making parsing your data in a language like T-SQL next to impossible. What problems are there? Based on the original sample data, at a glance the following problems exist:
The first data value's delimiter isn't consistent. Rows 1-5 use a semicolon (;), but row 6 uses a hyphen (-)
The last data value's delimiter isn't consistent. Rows 1, 2 & 4 use a space ( ), but row 3 uses a hyphen (-).
Internal data doesn't use a consistent delimiter. For example:
Row 1 has the value Open vehicle bonnet-BO-dri, which appears to be the values Open vehicle bonnet, BO and dri; so the hyphen (-) is the delimiter.
Row 5 has seats bo, which appears to be the values seats and bo, so uses a space ( ) as a delimiter.
The fact that row 6 has vehicle as its own value (vehicle-pu-bo-dri), however, implies that Open vehicle bonnet and Vehicle Exhaust (on rows 1 and 2 respectively) could actually be the values Open, vehicle, & bonnet and Vehicle & Exhaust respectively.
Honestly, the solution is to fix your design. As such, your tables should likely look something like this:
CREATE TABLE dbo.Job (JobID varchar(6) CONSTRAINT PK_JobID PRIMARY KEY NONCLUSTERED, --NONCLUSTERED because it's not always ascending
                      YourNumericalLikeValue varchar(5) NULL); --Obviously use a better name
CREATE TABLE dbo.JobTypeCompleted (JobTypeID int IDENTITY (1,1) CONSTRAINT PK_JobTypeID PRIMARY KEY CLUSTERED,
                                   JobID varchar(6) NOT NULL CONSTRAINT FK_JobType_Job FOREIGN KEY REFERENCES dbo.Job (JobID),
                                   JobType varchar(30) NOT NULL); --Most likely this'll actually be a foreign key to an actual job type table
GO
Then, for a couple of your rows, the data would be inserted like so:
INSERT INTO dbo.Job (JobID, YourNumericalLikeValue)
VALUES('P49384','22/10'),
('P9381',NULL);
GO
INSERT INTO dbo.JobTypeCompleted(JobID,JobType)
VALUES('P49384','Open vehicle bonnet'),
('P49384','BO'),
('P49384','dri'),
('P9381','vehicle'),
('P9381','pu'),
('P9381','bo'),
('P9381','dri');
Then you can easily get the jobs you want with a simple query:
SELECT J.JobID,
J.YourNumericalLikeValue
FROM dbo.Job J
WHERE EXISTS (SELECT 1
FROM dbo.JobTypeCompleted JTC
WHERE JTC.JobID = J.JobID
AND JTC.JobType = 'dri');
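To see the normalised design in action, here is a small sketch that recreates the two tables in an in-memory SQLite database (a stand-in for SQL Server, so the types and constraints are simplified). The extra P3193 "screwdriver" job is an assumption based on row 4 of the question, added to show that it is no longer matched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Job (JobID TEXT PRIMARY KEY, YourNumericalLikeValue TEXT);
CREATE TABLE JobTypeCompleted (
    JobTypeID INTEGER PRIMARY KEY AUTOINCREMENT,
    JobID TEXT NOT NULL REFERENCES Job (JobID),
    JobType TEXT NOT NULL
);
INSERT INTO Job VALUES ('P49384', '22/10'), ('P9381', NULL), ('P3193', '423');
INSERT INTO JobTypeCompleted (JobID, JobType) VALUES
    ('P49384', 'Open vehicle bonnet'), ('P49384', 'BO'), ('P49384', 'dri'),
    ('P9381', 'vehicle'), ('P9381', 'pu'), ('P9381', 'bo'), ('P9381', 'dri'),
    ('P3193', 'screwdriver'), ('P3193', 'pu');
""")

# Exact match on the job type, so 'screwdriver' cannot be mistaken for 'dri'.
dri_sql = """
SELECT J.JobID
FROM Job J
WHERE EXISTS (SELECT 1
              FROM JobTypeCompleted JTC
              WHERE JTC.JobID = J.JobID
                AND JTC.JobType = 'dri')
ORDER BY J.JobID
"""
dri_jobs = [row[0] for row in conn.execute(dri_sql)]
print(dri_jobs)
```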
You can also apply the LIKE operator in your query, as in column_name LIKE '%-dri'. This finds only the records that end with "-dri".
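If the data cannot be restructured and the filtering can happen outside the database, one option is to treat dri as a standalone token, meaning it is not preceded or followed by another letter. The sketch below is an assumption about what counts as a token boundary, applied in Python to the six sample rows from the question.

```python
import re

# 'dri' with no letter directly before or after it (case-insensitive).
JOB_TOKEN = re.compile(r"(?<![A-Za-z])dri(?![A-Za-z])", re.IGNORECASE)

rows = [
    "P49384; Open vehicle bonnet-BO-dri 22/10",
    "P93818; Vehicle exhaust-BO 10/20",
    "P1933; battery dri-pu-103/2",
    "P3193; screwdriver-pu 423",
    "X939; seats bo",
    "P9381-vehicle-pu-bo dri",
]

# Row numbers (1-based) whose text contains the standalone token.
matching = [i for i, text in enumerate(rows, start=1) if JOB_TOKEN.search(text)]
print(matching)
```

Row 4 is skipped because the dri inside screwdriver has letters on both sides.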

How can I check and remove duplicate rows?

I have a problem with quite a big table that has some NULL values in 3 columns: one datetime2 column and 2 float columns.
A nice, simple query from a similar question only picks up the 2 rows where datetime2 is NULL and nothing else (the same goes for a lot of other suggestions):
DELETE FROM MyTable
LEFT OUTER JOIN (
SELECT MIN(RowId) as RowId, allRemainingCols
FROM MyTable
GROUP BY allRemainingCols
) as KeepRows ON
MyTable.RowId = KeepRows.RowId
WHERE
KeepRows.RowId IS NULL
It seems to work when the datetime2 column has no NULLs?
There is a manual workaround, but is there any way to do this with a query or procedure using T-SQL only?
SELECT id,remainingColumns
FROM table
order BY remainingColumns
Compare all columns in Excel (15 in my case; I placed =ROW() in the first column as a check and put this formula next to the last column, plus an auto filter for TRUEs): =AND(B1=B2;C1=C2;D1=D2;E1=E2;F1=F2;G1=G2;H1=H2;I1=I2;J1=J2;K1=K2;L1=L2;M1=M2;N1=N2;O1=O2;P1=P2)
Or compare 3 rows like this and select all non-unique rows:
=OR(
AND(B1=B2;C1=C2;D1=D2;E1=E2;F1=F2;G1=G2;H1=H2;I1=I2;J1=J2;K1=K2;L1=L2;M1=M2;N1=N2;O1=O2;P1=P2);
AND(B2=B3;C2=C3;D2=D3;E2=E3;F2=F3;G2=G3;H2=H3;I2=I3;J2=J3;K2=K3;L2=L3;M2=M3;N2=N3;O2=O3;P2=P3)
)
It took quite a lot of work to find the answer for my particular data...
Most of the float numbers were slightly different.
This is hard to spot, but a simple CAST(column AS binary) can show these invisible differences...
For example, a displayed 96,6666666666667 can be 0x0000000000000000000000000000000000000000000040582AAAAAAAAAAD in one row and 0x0000000000000000000000000000000000000000000040582AAAAAAAAAAB in another, etc.
And a visible 96.6666666666667 can come back as something different yet again:
0x0000000000000000000000000000000000000F0D0001AB6A489F2D6F0300
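The effect can be reproduced outside the database. The sketch below decodes the last eight bytes of the two binary values above as IEEE 754 doubles and shows that they print identically at 15 significant digits while still comparing unequal. Comparing rounded values, as in the last line, is only one possible workaround and is an assumption, not something the original query does.

```python
import struct


def float_from_hex(bits):
    """Decode 16 hex digits as a big-endian IEEE 754 double."""
    return struct.unpack(">d", bytes.fromhex(bits))[0]


a = float_from_hex("40582AAAAAAAAAAD")
b = float_from_hex("40582AAAAAAAAAAB")

# Formatting with 15 significant digits hides the difference.
shown_a = format(a, ".15g")
shown_b = format(b, ".15g")
same_display = shown_a == shown_b

# The stored values are still different, so GROUP BY treats them as distinct.
exactly_equal = a == b

# Rounding to a tolerance before comparing makes them match again.
rounded_equal = round(a, 10) == round(b, 10)

print(shown_a, shown_b, same_display, exactly_equal, rounded_equal)
```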

SQL Server testing for 1 value in multiple columns

I am testing a table and would like to find out whether 10 columns of that table (integer fields) equal the value 999. Can ANY or the IN clause be used for this?
At a pure guess, and this is pseudo-SQL, but
SELECT {Columns}
FROM {YourTable}
WHERE '999' IN ({First Column},{Second Column},{Third Column},...,{Tenth Column});
If every column must equal 999, one way to test it is something like this:
select * from table1 where i1=999 and i1=i2 and i2=i3 and i3=i4 and i4=i5 and i5=i6 and i6=i7 and i7=i8 and i8=i9;
where iX are column names. This returns only the rows in which all 9 columns hold the value 999.
TSQL is a MS implementation of various SQL standards.
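The two answers test different things: value IN (columns) is true when any column matches, while the chained comparison is true only when all of them match. A minimal sketch with three columns instead of ten, using an in-memory SQLite database as a stand-in for SQL Server (the table contents are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, i1 INTEGER, i2 INTEGER, i3 INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(1, 999, 999, 999), (2, 999, 5, 7), (3, 1, 2, 3)],
)

# True when at least one of the listed columns holds 999.
any_ids = [r[0] for r in conn.execute(
    "SELECT id FROM table1 WHERE 999 IN (i1, i2, i3) ORDER BY id"
)]

# True only when every column holds 999.
all_ids = [r[0] for r in conn.execute(
    "SELECT id FROM table1 WHERE i1 = 999 AND i2 = 999 AND i3 = 999 ORDER BY id"
)]

print(any_ids, all_ids)
```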

Find alphabetic characters in a number column in SQL

I have a table like this:
CREATE TABLE [Mytable](
[Name] [varchar](10),
[number] [nvarchar](100) )
I want to find the [number] values that contain alphabetic characters.
The data should be formatted like this:
Name | number
---------------
Jack | 2131546
Ali | 2132132154
But sometimes the number is inserted malformed and contains alphabetic and other non-numeric characters, like this:
Name | number
---------------
Jack | 2[[[131546ddfd
Ali | 2132*&^1ASEF32154
I want to find these malformed rows.
I can't use LIKE, because LIKE makes my query very slow.
Updated to find all non-numeric characters:
select * from Mytable where number like '%[^0-9]%'
Regarding the comments on performance: using CLR and regex might speed things up slightly, but the bulk of the cost for this query is going to be the number of logical reads.
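The pattern [^0-9] means "any single character that is not a digit", so the LIKE above matches every value containing at least one such character. The same check can be sketched in Python with a regular expression, here applied to the sample rows from the question (this runs client-side and says nothing about SQL Server performance):

```python
import re

# Any single character that is not an ASCII digit.
NON_DIGIT = re.compile(r"[^0-9]")

rows = [
    ("Jack", "2131546"),
    ("Ali", "2132132154"),
    ("Jack", "2[[[131546ddfd"),
    ("Ali", "2132*&^1ASEF32154"),
]

# Keep only the rows whose number contains at least one non-digit character.
bad_rows = [row for row in rows if NON_DIGIT.search(row[1])]
print(bad_rows)
```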
A bit outside the box, but you could do something like:
bulk copy the data out of your table into a flat file
create a table that has the same structure as your original table but with a proper numeric type (e.g. int) for the [number] column.
bulk copy your data into this new table, making sure to specify a batch size of 1 and an error file (where rows that won't fit the schema will go)
rows that end up in the error file are the rows that have non-numerics in the [number] column
Of course, you could do the same thing with a cursor and a temp table or two...

MS Access row number, specify an index

Is there a way in MS Access to return the part of a dataset between two specific indexes?
So let's say my dataset is:
rank | first_name | age
1 Max 23
2 Bob 40
3 Sid 25
4 Billy 18
5 Sally 19
But I only want to return the records between 'rank' 2 and 4, so my result set is Bob, Sid and Billy. However, rank is not part of the table; it should be generated when the query is run. Why don't I use an autogenerated number? Because if a record is deleted it will be inconsistent, and what if I wanted the results in reverse?
This is obviously very simple. The reason I ask is that I am working on a product catalogue and am looking for a more efficient way of paging through the returned dataset: returning only one page's worth of data from the database is obviously going to be quicker than returning the complete set of 3000 records and then having to subselect from that set.
Thanks R.
Original suggestion:
SELECT * from table where rank BETWEEN 2 and 4;
Modified after the comment that rank does not exist in the structure:
Select top 100 * from table order by ID;
And if you want subsequent results, take the ID of the last record from the first query, say it was ID 100, and use a WHERE clause to get the next 100:
Select top 100 * from table where ID > 100 order by ID;
But these won't give you what you're looking for either, I bet.
How are you calculating rank? I assume you are basing it on some data in another dataset somewhere. If so, create a function, do a table join, or do something that can calculate rank based on values in other table(s), then you can do queries based on the rank() function.
For example:
select *
from table
where rank() between 2 and 4
If you are not calculating rank based on some data somewhere, there really isn't a way to write this query, and you might as well be returning three random rows from the table.
I think you need to use a correlated subquery to calculate the rank on the fly e.g. I'm guessing the rank is based on name:
SELECT T1.first_name, T1.age,
(
SELECT COUNT(*) + 1
FROM MyTable AS T2
WHERE T1.first_name > T2.first_name
) AS rank
FROM MyTable AS T1;
The bad news is that the Access data engine is poorly optimized for this kind of query; in my experience, performance will start to noticeably degrade beyond a few hundred rows.
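To check what the correlated subquery produces, the sketch below runs it against an in-memory SQLite database (a stand-in for Access, so timings are not comparable) and then keeps ranks 2 to 4. Ranking by first_name follows the guess above; the alias rnk and the outer wrapper query are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (first_name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("Max", 23), ("Bob", 40), ("Sid", 25), ("Billy", 18), ("Sally", 19)],
)

# Rank = 1 + number of rows whose name sorts before this one.
ranked_sql = """
SELECT first_name, age, rnk
FROM (
    SELECT T1.first_name, T1.age,
           (SELECT COUNT(*) + 1
            FROM MyTable AS T2
            WHERE T1.first_name > T2.first_name) AS rnk
    FROM MyTable AS T1
)
WHERE rnk BETWEEN ? AND ?
ORDER BY rnk
"""
page = conn.execute(ranked_sql, (2, 4)).fetchall()
print(page)
```

Because the rank here is alphabetical, ranks 2 to 4 are Bob, Max and Sally rather than the insertion-order names in the question.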
If it is not possible to maintain the rank on the db side of the house (e.g. high insertion environment) consider doing the paging on the client side. For example, an ADO classic recordset object has properties to support paging (PageCount, PageSize, AbsolutePage, etc), something for which DAO recordsets (being of an older vintage) have no support.
As always, you'll have to perform your own timings but I suspect that when there are, say, 10K rows you will find it faster to take on the overhead of fetching all the rows to an ADO recordset then finding the page (then perhaps fabricate smaller ADO recordset consisting of just that page's worth of rows) than it is to perform a correlated subquery to only fetch the number of rows for the page.
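Client-side paging boils down to fetching the ordered rows once and slicing out one page. The sketch below shows the idea in plain Python rather than with an ADO recordset; get_page and its 1-based page numbering are hypothetical choices for illustration.

```python
def get_page(rows, page_number, page_size):
    """Return the rows on the given 1-based page of an already ordered list."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be positive")
    start = (page_number - 1) * page_size
    return rows[start:start + page_size]


# Hypothetical, already ordered result set fetched in one round trip.
all_rows = ["Max", "Bob", "Sid", "Billy", "Sally"]

first_page = get_page(all_rows, 1, 2)
last_page = get_page(all_rows, 3, 2)
beyond_end = get_page(all_rows, 4, 2)
print(first_page, last_page, beyond_end)
```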
Unfortunately the LIMIT keyword isn't available in MS Access -- that's what is used in MySQL for a multi-page presentation. If you can write an order key into the results table, then you can use it something like this:
SELECT TOP 25 MyOrder, Etc FROM Table1 WHERE MyOrder in
(SELECT TOP 55 MyOrder FROM Table1 ORDER BY MyOrder DESC)
ORDER BY MyOrder ASC
If I understand you correctly, there are only first_name and age columns in your table. If this is the case, then there is no way to return Bob, Sid, and Billy with a single query, unless you do something like
SELECT * FROM Table
WHERE FirstName = 'Bob'
OR FirstName = 'Sid'
OR FirstName = 'Billy'
But I think that this is not what you are looking for.
This is because SQL databases make no guarantee as to the order that the data will come out of the database unless you specify an ORDER BY clause. It will usually come out in the same order it was added, but there are no guarantees, and once you get a lot of rows in your table, there's a reasonably high probability that they won't come out in the order you put them in.
As a side note, you should probably add a "rank" column (this column is usually called id) to your table, and make it an auto-incrementing integer (see the Access documentation), so that you can do the query mentioned by Sev. It's also important to have a primary key so that you can be certain which rows are being updated when you run an update query, or which rows are being deleted when you run a delete query. For example, if you had 2 people named Max, and they were both 23, how would you delete one row without deleting the other? If you had an auto-incrementing unique column in there, you could specify the unique ID in your query to delete only one.
[ADDITION]
Upon reading your comment: if you add an autoincrement field and want to read 3 rows, and you know the ID of the first row you want to read, then you can use TOP to read 3 rows.
Assuming your data looks like this
ID | first_name | age
1 Max 23
2 Bob 40
6 Sid 25
8 Billy 18
15 Sally 19
You can query Bob, Sid and Billy with the following query.
SELECT TOP 3 first_name, age
From Table
WHERE ID >= 2
ORDER BY ID
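The same idea extends to paging through the whole table: remember the last ID of the page just read and ask for the next rows after it. The sketch below uses an in-memory SQLite database, where LIMIT plays the role of TOP in Access; the table name people and the fetch_page helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (ID INTEGER PRIMARY KEY, first_name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Max", 23), (2, "Bob", 40), (6, "Sid", 25), (8, "Billy", 18), (15, "Sally", 19)],
)


def fetch_page(conn, after_id, page_size):
    """Return up to page_size rows whose ID is greater than after_id."""
    return conn.execute(
        "SELECT ID, first_name FROM people WHERE ID > ? ORDER BY ID LIMIT ?",
        (after_id, page_size),
    ).fetchall()


first_page = fetch_page(conn, 1, 3)
# Continue from the last ID seen on the previous page.
next_page = fetch_page(conn, first_page[-1][0], 3)
print(first_page, next_page)
```

Gaps left by deleted rows do not matter, because each page is defined by the last ID seen rather than by a row position.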
