I am attempting to combine a series of linestring geometries that share matching attributes (road_name_search and road_id), and then join each merged linestring back to its road_id attribute, using the following sample data:
http://sqlfiddle.com/#!15/f52d21 (please note: if you copy this to a database, you will need to change the type of the shape column to geometry(MultiLineString,2193). Is there a PostGIS version of SQL Fiddle?)
I have tried the code below, which works well to combine the road names:
SELECT
MAX (road_id),
road_name_search
FROM a_road_test
GROUP BY road_name_search
This appears to work as expected with the following results:
max road_name_search
3033986 Kennedy Road (Onekawa)
3033986 Kennedy Road (Greenmeadows)
3033986 Kennedy Road (Marewa)
1808281 Kennedy Road (Pyes Pa)
3033986 Kennedy Road (Pirimai)
Next, I tried to include the geometries so that I end up with everything I need. When I extend the query to cluster the geometries like so:
SELECT
MAX (road_id),
road_name_search,
unnest(ST_ClusterIntersecting(shape))
FROM a_road_test
GROUP BY road_name_search
Then I end up with duplicated road_name_search values, i.e. more than one occurrence of each, e.g.:
1808281 Kennedy Road (Pyes Pa) geoma
3033986 Kennedy Road (Pirimai) geomb
3033986 Kennedy Road (Pirimai) geomc
3033986 Kennedy Road (Onekawa) geomd
3033986 Kennedy Road (Greenmeadows) geome
3033986 Kennedy Road (Greenmeadows) geomf
3033986 Kennedy Road (Marewa) geomg
3033986 Kennedy Road (Marewa) geomh
Next approach: exclude the unnest function.
SELECT
MAX (road_id),
road_name_search,
ST_ClusterIntersecting(shape)
FROM a_road_test
GROUP BY road_name_search
Now I end up with...
1808281 Kennedy Road (Pyes Pa) [geoma]
3033986 Kennedy Road (Pirimai) [geomb, geomc]
3033986 Kennedy Road (Onekawa) [geomd]
3033986 Kennedy Road (Greenmeadows) [geome, geomf]
3033986 Kennedy Road (Marewa) [geomg, geomh]
I just can't quite figure out how to get this:
1808281 Kennedy Road (Pyes Pa) geoma
3033986 Kennedy Road (Pirimai) geombc
3033986 Kennedy Road (Onekawa) geomd
3033986 Kennedy Road (Greenmeadows) geomef
3033986 Kennedy Road (Marewa) geomgh
Thanks for looking :)
A very simple answer, as identified by @ewcz, is to aggregate the geometries with ST_Union:
SELECT MAX(road_id),
road_name_search,
ST_Union(shape)
FROM a_road_test
GROUP BY road_name_search
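To see why an aggregate gives one row per road name while unnest gives several, here is a minimal sketch using Python's built-in sqlite3. It is not PostGIS: the shape column holds plain labels, group_concat is only a stand-in for ST_Union, and the road_id values are made up for illustration.

```python
import sqlite3

# Hypothetical miniature of a_road_test: 'shape' holds a text label rather
# than a geometry, and the road_id values are invented for this sketch.
rows = [
    (20, "Kennedy Road (Pyes Pa)", "geoma"),
    (10, "Kennedy Road (Pirimai)", "geomb"),
    (11, "Kennedy Road (Pirimai)", "geomc"),
    (30, "Kennedy Road (Onekawa)", "geomd"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE a_road_test (road_id INTEGER, road_name_search TEXT, shape TEXT)"
)
conn.executemany("INSERT INTO a_road_test VALUES (?, ?, ?)", rows)

# group_concat stands in for ST_Union here: both are aggregates that fold
# every row of a group into a single value, so each name appears only once.
merged = conn.execute(
    """
    SELECT MAX(road_id), road_name_search, group_concat(shape, '+')
    FROM a_road_test
    GROUP BY road_name_search
    ORDER BY road_name_search
    """
).fetchall()

for road_id, name, shape in merged:
    print(road_id, name, shape)
```

The two Pirimai rows collapse into a single row carrying both labels, which is the shape of result asked for in the question.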
I am new to coding. I have an employee table that looks like this:
Name     Hometown    School
Jeff     Illinois    Loyola University Chicago
Alice    California  New York University
William  Michigan    University of Illinois at Chicago
Fiona    California  Loyola University Chicago
Charles  Michigan    New York University
Linda    Indiana     Loyola University Chicago
I am trying to put these employees into pairs in which the two employees come from different states and different universities. Each person can be in only one pair. The expected table should look like this:
employee1  employee2
Jeff       Alice
William    Fiona
Charles    Linda
The real table is over 3,000 rows. I am trying to do it with SQL or Python, but I don't know where to start.
A straightforward approach is to go through the employees one by one and, for each, search the rows after it for a suitable peer; peers that have been found are flagged so that they are not paired again. Since in your case a peer should be found after a few steps, this iteration will likely be faster than operations that construct whole data sets at once.
from io import StringIO
import pandas as pd

# read example employee table (columns separated by two or more spaces)
df = pd.read_table(StringIO("""Name     Hometown    School
Jeff     Illinois    Loyola University Chicago
Alice    California  New York University
William  Michigan    University of Illinois at Chicago
Fiona    California  Loyola University Chicago
Charles  Michigan    New York University
Linda    Indiana     Loyola University Chicago
"""), sep=r"\s{2,}", engine="python")

# create expected table; its length is half that of the above
# (integer division, because RangeIndex needs an int)
ef = pd.DataFrame(index=pd.RangeIndex(len(df) // 2), columns=['employee1', 'employee2'])
k = 0   # number of found pairs, index into expected table

# array of flags for already paired employees
paired = pd.Series(False, index=pd.RangeIndex(len(df)))

# go through the employee table and collect pairs
for i in range(len(df)):
    if paired[i]:
        continue
    for j in range(i + 1, len(df)):
        if (not paired[j]
                and df.iloc[j]['Hometown'] != df.iloc[i]['Hometown']
                and df.iloc[j]['School'] != df.iloc[i]['School']):
            # we found a pair - store it, mark employee j paired
            # (to_numpy() avoids aligning the names on their row labels)
            ef.iloc[k] = df.iloc[[i, j]]['Name'].to_numpy()
            k += 1
            paired[j] = True
            break
    else:
        # the inner loop ended without a break: nobody is left for employee i
        print("no peer for", df.iloc[i]['Name'])

print(ef)
output:
employee1 employee2
0 Jeff Alice
1 William Fiona
2 Charles Linda
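The same greedy idea can also be sketched without pandas, which may be easier to follow if you are new to coding. The function name pair_employees and the tuple layout are my own choices for illustration, not part of the answer above.

```python
from typing import List, Tuple

Employee = Tuple[str, str, str]  # (name, hometown, school)


def pair_employees(
    employees: List[Employee],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Greedily pair employees who differ in both hometown and school.

    Returns the list of pairs and the names nobody could be found for.
    """
    paired = [False] * len(employees)
    pairs: List[Tuple[str, str]] = []
    unmatched: List[str] = []
    for i, (name_i, home_i, school_i) in enumerate(employees):
        if paired[i]:
            continue
        for j in range(i + 1, len(employees)):
            name_j, home_j, school_j = employees[j]
            if not paired[j] and home_j != home_i and school_j != school_i:
                paired[i] = paired[j] = True
                pairs.append((name_i, name_j))
                break
        else:
            # The inner loop finished without a break: no peer is left.
            unmatched.append(name_i)
    return pairs, unmatched


employees = [
    ("Jeff", "Illinois", "Loyola University Chicago"),
    ("Alice", "California", "New York University"),
    ("William", "Michigan", "University of Illinois at Chicago"),
    ("Fiona", "California", "Loyola University Chicago"),
    ("Charles", "Michigan", "New York University"),
    ("Linda", "Indiana", "Loyola University Chicago"),
]

pairs, unmatched = pair_employees(employees)
print(pairs)
print(unmatched)
```

Note that this is a greedy pass: it takes the first acceptable peer it finds, so on other data it may leave people unmatched even though a different choice of pairs would have matched everyone.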
I have a question, and I don't know whether this is possible. I have a huge table in my database, around a million records, and I would like to know whether there is a way to create a pivot table in Excel that calls a query and shows data in my workbook already filtered by a selection. For example:
The table in my database:
SKU                    STYLE        CATEGORY  BRAND   STORE    COUNTRY
----------------------------------------------------------------------------------
ADIDAS BLUE PANT XL    BLUE PANT    PANT      ADIDAS  STORE 1  USA
ADIDAS BLUE PANT L     BLUE PANT    PANT      ADIDAS  STORE 1  CANADA
ADIDAS BLUE PANT S     BLUE PANT    PANT      ADIDAS  STORE 2  AUSTRALIA
ADIDAS RED HAT XL      RED HAT      HAT       ADIDAS  STORE 2  AUSTRALIA
ADIDAS RED HAT L       RED HAT      HAT       ADIDAS  STORE 3  USA
ADIDAS RED HAT S       RED HAT      HAT       ADIDAS  STORE 3  KONGO
ADIDAS BLACK SHIRT XL  BLACK SHIRT  SHIRT     ADIDAS  STORE 2  KONGO
ADIDAS BLACK SHIRT L   BLACK SHIRT  SHIRT     ADIDAS  STORE 1  USA
ADIDAS BLACK SHIRT S   BLACK SHIRT  SHRIT     ADIDAS  STORE 4  USA
...
.....
......
Before loading the entire dataset into Excel, I would like to tell the query to filter by store or category. Then I want to build a pivot table so that users can choose which columns they want to see.
Take a look at this.
The nice thing about pivoting data in Excel from another source is that you can pull from a source that far exceeds 1,048,576 rows, aggregate the records in Excel, and still stay well under Excel's 1,048,576-row limit.
Check out this link as well.
https://www.ptr.co.uk/blog/how-do-you-create-pivot-tables-sql-server-queries
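The general idea of filtering and aggregating in the database before anything reaches the workbook can be sketched like this. It uses Python's built-in sqlite3 as a stand-in for your real database; the table name sales and the helper summarize_store are assumptions made for illustration, and how the Excel connection itself is wired up is not shown.

```python
import sqlite3

# Hypothetical stand-in for the large database table. The table name "sales"
# is an assumption; the rows are taken from the sample in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sku TEXT, style TEXT, category TEXT,"
    " brand TEXT, store TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("ADIDAS BLUE PANT XL", "BLUE PANT", "PANT", "ADIDAS", "STORE 1", "USA"),
        ("ADIDAS BLUE PANT L", "BLUE PANT", "PANT", "ADIDAS", "STORE 1", "CANADA"),
        ("ADIDAS BLUE PANT S", "BLUE PANT", "PANT", "ADIDAS", "STORE 2", "AUSTRALIA"),
        ("ADIDAS RED HAT L", "RED HAT", "HAT", "ADIDAS", "STORE 3", "USA"),
        ("ADIDAS BLACK SHIRT L", "BLACK SHIRT", "SHIRT", "ADIDAS", "STORE 1", "USA"),
    ],
)


def summarize_store(conn, store):
    """Filter by store in the database, then count SKUs per category/country."""
    return conn.execute(
        "SELECT category, country, COUNT(*) FROM sales"
        " WHERE store = ?"
        " GROUP BY category, country"
        " ORDER BY category, country",
        (store,),  # bound parameter: the user's selection
    ).fetchall()


summary = summarize_store(conn, "STORE 1")
print(summary)
```

Only the small, already aggregated result leaves the database, so the size of the source table matters much less than the number of distinct category/country combinations.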
I have a POS (point of sale) database where we store articles, barcodes (eannos) and order lines (and much more, but I am only using these now).
I need to get a list of which articles and which barcodes have been sold, and I'm nearly there. The only issue I can't seem to get right is articles that have more than one barcode (sometimes we add multiple barcodes to an article if the product comes in different colours, but we want to stick to a single article number).
So my SQL query is:
select Description, eanno, eannoid, sum(count) from PurchaseOrderLines
join eannos on PurchaseOrderLines.SizeColorID=EanNos.SizeColorID
where PurchaseOrderLines.articleid in (select articleid from articles where articleno in ('60321129','60314516'))
group by Description, eanno, eannoid
The result is:
Description                                  Eanno          Eannoid  Sold
Top l/s AOP Baby Dark Sapphire 74            7325850944711  141588    2.00
Top l/s AOP Baby Dark Sapphire 80            7325850944735  141589    2.00
Top l/s AOP Baby Dark Sapphire 86            7325850944759  141590    4.00
Top l/s AOP Baby Dark Sapphire 92            7325850944773  141591    4.00
Bow Tie Solid Preschool Ski Patrol One size  7325851134869  141819   30.00
Bow Tie Solid Preschool Ski Patrol One size  7325851176012  142937   30.00
The last line in the result is a duplicate: only 30 of "Bow Tie Solid Preschool Ski Patrol One size" have been sold, but I'm getting duplicate lines because the query shows each barcode of that same article and sums the count from the order lines for each one.
How can I make sure only one record shows?
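One possible way to think about it, sketched with Python's built-in sqlite3: the join repeats every order line once per barcode, so the sum is computed first and a single barcode per size/colour is picked afterwards (here with MIN, which is an arbitrary choice). The schema is heavily simplified and the column name Quantity, the order-line quantities and the placement of Description are assumptions, not your real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schema; only the columns needed for the sketch.
conn.execute(
    "CREATE TABLE PurchaseOrderLines (SizeColorID INTEGER, Description TEXT, Quantity REAL)"
)
conn.execute("CREATE TABLE EanNos (EanNoID INTEGER, SizeColorID INTEGER, EanNo TEXT)")
conn.executemany(
    "INSERT INTO PurchaseOrderLines VALUES (?, ?, ?)",
    [
        (1, "Bow Tie Solid Preschool Ski Patrol One size", 10),  # made-up split of 30
        (1, "Bow Tie Solid Preschool Ski Patrol One size", 20),
        (2, "Top l/s AOP Baby Dark Sapphire 74", 2),
    ],
)
conn.executemany(
    "INSERT INTO EanNos VALUES (?, ?, ?)",
    [
        (141819, 1, "7325851134869"),
        (142937, 1, "7325851176012"),  # second barcode on the same size/colour
        (141588, 2, "7325850944711"),
    ],
)

# Grouping by barcode after the join: one output row per barcode.
naive = conn.execute(
    "SELECT l.Description, e.EanNo, SUM(l.Quantity)"
    " FROM PurchaseOrderLines l JOIN EanNos e ON l.SizeColorID = e.SizeColorID"
    " GROUP BY l.Description, e.EanNo ORDER BY l.Description, e.EanNo"
).fetchall()

# Sum the order lines first, then attach a single barcode per SizeColorID.
deduped = conn.execute(
    "SELECT l.Description, MIN(e.EanNo), l.Sold"
    " FROM (SELECT SizeColorID, Description, SUM(Quantity) AS Sold"
    "       FROM PurchaseOrderLines GROUP BY SizeColorID, Description) AS l"
    " JOIN EanNos e ON e.SizeColorID = l.SizeColorID"
    " GROUP BY l.SizeColorID, l.Description, l.Sold"
    " ORDER BY l.Description"
).fetchall()

print(naive)
print(deduped)
```

Whether MIN, MAX or some "primary barcode" flag is the right way to choose the barcode depends on your data; the sketch only shows that the choice has to be made explicitly.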
Data:
PetType LivesIn Name
Fish Lake Tom
Fish Fishbowl Dick
Fish Aquarium Harry
Dog Farm Fido
Dog House Fluffy
Dog Wild Duke
I would like to search on PetType=Fish and get back only the LivesIn facet values Lake, Fishbowl and Aquarium. It seems that, by default, the Dog LivesIn facet values Farm, House and Wild are also returned, with zero counts. BTW: per the docs, I was told to use an fq for Fish so that the results use the cache.
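For reference, the request described above could be built roughly as follows. Solr's facet.mincount parameter defaults to 0, which is why zero-count values are listed; setting it to 1 is one way to leave them out. The host, port and core name pets in the URL are placeholders I made up, and nothing is actually sent here.

```python
from urllib.parse import urlencode

# Request parameters for a faceted Solr query. The field names come from
# the sample data above; everything else about the setup is assumed.
params = [
    ("q", "*:*"),
    ("fq", "PetType:Fish"),    # filter query, as the docs suggested
    ("facet", "true"),
    ("facet.field", "LivesIn"),
    ("facet.mincount", "1"),   # skip facet values whose count is zero
]

query_string = urlencode(params)

# Hypothetical host and core name, for illustration only.
url = "http://localhost:8983/solr/pets/select?" + query_string
print(url)
```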
I am having a problem selecting a buyer's name, the products they bought, and the price of each item. I want the results to show each buyer's name just once and list every product with its price.
Select distinct Buyer_First_Name, Buyer_Last_Name, Appliance_num, Price
from Buyer inner join Appliance
on buyer.Buyer_num = Appliance.Buyer_num;
This winds up listing the buyer's name multiple times, because DISTINCT applies to the whole row, not just the name. The results I am seeing with this code are:
Buyer_First_Name  Buyer_Last_Name  Appliance_num  Price
John              Smith            000001         $19.99
John              Smith            000002         $45.99
John              Smith            000003         $12.99
John              Smith            000004         $17.99
Mike              Brown            000001         $19.99
Mike              Brown            000005         $33.99
Mike              Brown            000006         $29.99
What I want to see:
Buyer_First_Name  Buyer_Last_Name  Appliance_num  Price
John              Smith            000001         $19.99
                                   000002         $45.99
                                   000003         $12.99
                                   000004         $17.99
Mike              Brown            000001         $19.99
                                   000005         $33.99
                                   000006         $29.99
Thanks.
Hiding the duplicates is something you need to do in the presentation, not in the query. Look for an option in your reporting tool or adjust your code to do this.
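If your presentation code happens to be something you write yourself, blanking the repeated names is a small loop over the query result. A minimal sketch, assuming the rows arrive sorted by buyer; the function name blank_repeated_buyers is made up for this example.

```python
from typing import Iterable, List, Tuple

Row = Tuple[str, str, str, str]  # (first name, last name, appliance, price)


def blank_repeated_buyers(rows: Iterable[Row]) -> List[Row]:
    """Replace the buyer's name with empty strings on consecutive repeats.

    Assumes the rows are already ordered so that each buyer's rows are adjacent.
    """
    result: List[Row] = []
    previous = None
    for first, last, appliance, price in rows:
        buyer = (first, last)
        if buyer == previous:
            result.append(("", "", appliance, price))
        else:
            result.append((first, last, appliance, price))
            previous = buyer
    return result


# Rows as returned by the query in the question.
rows = [
    ("John", "Smith", "000001", "$19.99"),
    ("John", "Smith", "000002", "$45.99"),
    ("John", "Smith", "000003", "$12.99"),
    ("John", "Smith", "000004", "$17.99"),
    ("Mike", "Brown", "000001", "$19.99"),
    ("Mike", "Brown", "000005", "$33.99"),
    ("Mike", "Brown", "000006", "$29.99"),
]

display = blank_repeated_buyers(rows)
for first, last, appliance, price in display:
    print(f"{first:<18}{last:<17}{appliance:<15}{price}")
```

The query stays as it is (ideally with an ORDER BY on the buyer columns so the rows for one buyer are adjacent); only the display changes.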