Use of order
Apartment Canada Toronto
38 37 2042 37
Appartment Building Apt
54 357
Can you help me convert the names in my array to capital letters?
Try this:
names(ordered_use) <- toupper(names(ordered_use))
> ordered_use
APARTMENT APARTMENT BLDG APARTMENT BUILDING APARTMENT UNIT APPARTMENT BUILDING
38 37 2042 37 54
APT
357
I have a report in Excel that displays the sales results for each employee. The columns are Location, Region, Username & Sales. It is sorted by Sales descending, showing which employee has the best sales in the company.
I am attempting to add one sheet per region that displays the results for all employees in that region, also sorted by Sales (to avoid sorting the results of the many regions myself every day).
An example version of the first 12 rows of the Data Sheet:
G H I J K X
Row Location Username Sales Region Region
1 38 John.Doe 85 North1 North1
2 154 John.Smith 83 South2
3 23 E.Williams 83 North1
4 210 M.Williams 79 East5
5 139 Joe.Dawn 77 North2
6 22 Kay.Smith 69 South2
7 51 Jay.Smith 69 South2
8 125 L.Smith 69 East2
9 51 L.Day 69 South2
10 23 23.Guest2 67 North1
11 92 U.Goode 65 North4
I have successfully created an array function that pulls the Sales column of only the results in the specified region.
{=LARGE(SMALL(IF(IF(ISERROR(K:K),"",K:K)=$X$2,J:J),
ROW(INDIRECT("1:"&COUNTIF(K:K,$X$2)))),F2)}
I am attempting now for an array function that pulls the Username that matches the corresponding sales amount in the original array, and also matches the region. I am having trouble when a single region has 'ties' or more than one employee with the same sales that month. Here is what I started with for that function:
=INDEX(I:I,MATCH(1,(Y2=J:J)*($X$1=K:K),0))
That fails, however, when a single region has multiple users with the same sales figure. So I am trying a conditional to accommodate this, falling back to the function that I know works when a sales value occurs only once in that region.
{=IF(COUNTIF($AB$2:AB2,AB2)>1,
INDEX(I:I,
SMALL(IF(J:J=AB2,
IF(K:K=$AB$2,ROW(K:K)-ROW(INDEX(K:K,1,1))+1)),
COUNTIF($AB$2:AB2,AB2))),
INDEX(I:I,MATCH(1,(AC2=J:J)*($AB$2=K:K),0)))}
The inner piece may be sufficient if it worked, excluding the need for the conditional:
{=INDEX(I:I,
SMALL(IF(J:J=AB2,
IF(K:K=$AB$2,ROW(K:K)-ROW(INDEX(K:K,1,1))+1)),
COUNTIF($AB$2:AB2,AB2)))}
I'll use the same function for Username.
Expected results for two regions:
X Y Z AA AB AC AD AE
Region Sales Username Location Region Sales Username Location
North1 85 John.Doe 38 South2 83 John.Smith 154
83 E.Williams 23 69 Kay.Smith 22
67 23.Guest2 23 69 Jay.Smith 51
69 L.Day 51
Since beginning to type this question I have found a work around that includes a few additional columns to complete the calculation, but still wanted to ask this to see if it was possible for knowledge's sake.
With North1 in X2, these are the formulas for Y2:AA2.
=IFERROR(AGGREGATE(14, 6, ($J$2:$J$999)/($K$2:$K$999=X$2), ROW(1:1)), "")
=IFERROR(INDEX($I:$I, AGGREGATE(15, 6, ROW($2:$999)/(($K$2:$K$999=X$2)*($J$2:$J$999=Y2)), COUNTIF(Y$2:Y2, Y2))), "")
=IFERROR(INDEX($H:$H, AGGREGATE(15, 6, ROW($2:$999)/(($K$2:$K$999=X$2)*($J$2:$J$999=Y2)), COUNTIF(Y$2:Y2, Y2))), "")
Fill down as necessary.
With South2 in AB2, copy Y2:AA2 to AC2:AE2 and fill down as necessary.
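The tie handling in these formulas comes down to one idea: rank a region's rows by Sales, and when Sales are equal, take the rows in the order they appear on the sheet. The following is a minimal Python sketch of that logic, not a replacement for the worksheet formulas. The names `Row`, `DATA` and `region_report` are illustrative assumptions, and the data is the sample from the question.

```python
from collections import namedtuple

# Hypothetical record type mirroring columns H:K of the data sheet.
Row = namedtuple("Row", "location username sales region")

# Sample rows from the question's data sheet, in sheet order.
DATA = [
    Row(38, "John.Doe", 85, "North1"),
    Row(154, "John.Smith", 83, "South2"),
    Row(23, "E.Williams", 83, "North1"),
    Row(210, "M.Williams", 79, "East5"),
    Row(139, "Joe.Dawn", 77, "North2"),
    Row(22, "Kay.Smith", 69, "South2"),
    Row(51, "Jay.Smith", 69, "South2"),
    Row(125, "L.Smith", 69, "East2"),
    Row(51, "L.Day", 69, "South2"),
    Row(23, "23.Guest2", 67, "North1"),
    Row(92, "U.Goode", 65, "North4"),
]


def region_report(rows, region):
    """Return (sales, username, location) tuples for one region, best sales first."""
    in_region = [r for r in rows if r.region == region]
    # sorted() is stable, also with reverse=True: rows with equal sales keep
    # their sheet order, which is how ties are resolved here.
    ranked = sorted(in_region, key=lambda r: r.sales, reverse=True)
    return [(r.sales, r.username, r.location) for r in ranked]


if __name__ == "__main__":
    for line in region_report(DATA, "South2"):
        print(line)
```

Relying on a stable sort has the same effect as counting how many times a sales value has already appeared and picking the matching row with that ordinal, which is what the COUNTIF term does in the formulas.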
ordered_use
2 UNIT CONVERTED DWELLING
28706 51
2 UNIT DWELLING 2 UNITS
99 44
3 UNIT DWELLING APARTMENT
31 4733
APARTMENT APARTMENT BLDG
38 37
APARTMENT BUILDING APARTMENT UNIT
2042 37
APPARTMENT BUILDING APT
54 357
APT BLDG APT BUILDING
78 49
APT. APT. BLDG
41 61
APT. BUILDING ARENA
35 67
BANK BOWLING ALLEY
302 267
BUNGALOW CAR DEALERSHIP
85 62
CHURCH CLUB
94 40
COLLEGE COMMERCIAL
196 410
COMMERCIAL/RESIDENTIAL COMMUNITY CENTRE
56 131
COMMUNITY HALL CONDO
31 223
CONDOMINIUM CONVERTED DWELLING
42 42
CONVERTED HOUSE CONVERTED HOUSE - 2 UNITS
149 124
CONVERTED HOUSE - 3 UNITS CONVERTED HOUSE (2 UNITS)
56 35
CONVERTED HOUSE 2 UNITS CONVERTED HOUSE, 2 UNITS
38 84
CONVERTED HOUSE, 3 UNITS DAYCARE
42 31
DENTAL OFFICE DETACHED - SFD
87 513
DETACHED - SINGLE FAMILY DWELLING DETACHED HOUSE
97 130
DETACHED SFD DUPLEX
190 145
ELEMENTARY SCHOOL FIRE HALL
859 41
FITNESS CENTRE FUNERAL HOME
48 36
GARAGE GAS STATION
63 130
GROCERY STORE GROUP HOME
51 45
HAIR SALON HOME FOR THE AGED
46 49
HOSPITAL HOTEL
971 215
HOUSE IND
1249 219
INDUSTRIAL INDUSTRIAL
1725 35
INDUSTRIAL BUILDING INDUSTRIAL MANUFACTURING
51 91
INDUSTRIAL WAREHOUSE INSTITUTIONAL
61 48
LAB LABORATORY
56 46
LIBRARY LONG TERM CARE FACILITY
91 74
LUMBER YARD MANUFACTURING
53 55
MEDICAL OFFICE MIXED USE
247 539
MIXED USE MIXED USE (COMMERCIAL)
40 34
MIXED USE (RETAIL) MIXED USE BUILDING
74 297
MIXED USE BUILDING/NON RESIDENTIAL MIXED USE NON RES
37 93
MIXED USE NON RES (RETAIL) MIXED USE RES & NON RES
52 59
MIXED-USE MULTI UNIT
202 54
MULTI UNIT BUILDING MULTI USE
70 381
MULTI USE, NON RES MULTI USE/NON RES
134 36
MULTI USE/NON RESIDENTIAL MULTIPLE UNIT
49 149
MULTIPLE UNIT BUILDING MUSEUM
40 40
N/A NONE
650 264
NOT KNOWN NURSING HOME
58 55
OFF OFFICE
181 9698
OFFICE OFFICE BLD
50 46
OFFICE BUILDING OFFICE SPACE
177 39
OFFICE/RETAIL OFFICE/WAREHOUSE
36 95
OFFICES OTHER
63 54
PARK PARKING GARAGE
137 149
PARKING LOT PERSONAL SERVICE SHOP
126 49
PLACE OF WORSHIP POLICE STATION
516 34
PROF. OFFICE RECREATIONAL
65 46
REPAIR GARAGE RES
91 242
RESIDENTIAL RESIDENTIAL - SFD
488 561
RESIDENTIAL CONDO REST
39 46
RESTAURANT RESTAURANT > 30 SEATS
1074 42
RESTAURANT GREATER THAN 30 SEATS RESTAURANT LESS THAN 30 SEATS
145 69
RESTAURANT UNDER 30 SEATS RESTAURANT, GREATER THAN 30 SEATS
47 42
RET RETAIL
81 4001
RETAIL RETAIL MALL
46 61
RETAIL PLAZA RETAIL STORE
96 796
RETAIL/OFFICE RETAIL/RESIDENTIAL
32 99
ROOMING HOUSE ROW HOUSE
89 38
SCHOOL SECONDARY SCHOOL
594 246
SEMI SEMI DETACHED
209 218
SEMI DETACHED - SFD SEMI DETACHED - SINGLE FAMILY DWELLING
212 50
SEMI DETACHED SFD SEMI-DETACHED
46 71
SEMI-DETACHED - SFD SEMI-DETACHED DWELLING
241 172
SEMI-DETACHED HOUSE SEMI-DETACHED SFD
56 155
SEMI-DETACHED SINGLE FAMILY DWELLING SFD
90 26479
SFD - DETACHED SFD - DETCAHED
3817 79
SFD - ROWHOUSE SFD - SEMI
76 206
SFD - SEMI DETACHED SFD - SEMI-DETACHED
353 209
SFD - SEMIDETACHED SFD - TOWNHOUSE
158 131
SFD DET SFD DETACEHD
495 39
SFD DETACHED SFD DETATCHED
755 231
SFD ROWHOUSE SFD SEMI
31 857
SFD SEMI DETACHED SFD SEMI-DETACHED
59 167
SFD TOWNHOUSE SFD-DETACHED
155 8148
SFD-DETACHED SFD-ROWHOUSE
37 56
SFD-SEMI SFD-SEMI DETACHED
1189 613
SFD-SEMI-DETACHED SFD-TOWNHOUSE
313 526
SINGLE SINGLE FAMILY
41 64
SINGLE FAMILY DETACHED SINGLE FAMILY DETACHED DWELLING
1615 222
SINGLE FAMILY DETACHED HOUSE SINGLE FAMILY DWELLING
58 2673
SINGLE FAMILY SEMI-DETACHED SINGLE-FAMILY DETACHED HOUSE
54 107
SINGLE-FAMILY SEMI-DETACHED HOUSE STADIUM
53 37
STUDENT RESIDENCE SUBWAY STATION
34 44
SURFACE PARKING LOT/EXISTING COMMERCIAL BUILDING TAKE OUT RESTAURANT
57 34
THEATRE TOWNHOUSE
38 198
TOWNHOUSE - SFD TOWNHOUSES
97 31
TRANSIT STATION TRIPLEX
70 54
UNION STATION UNIVERSITY
52 359
UNIVERSITY OF TORONTO VACANT
42 15010
VACANT VACANT (AFTER DEMO)
77 36
VACANT COMMERCIAL VACANT COMMERCIAL UNIT
37 63
VACANT INDUSTRIAL VACANT LAND
32 1107
VACANT LOT VACANT RETAIL
447 112
VACANT RETAIL UNIT VACANT SINGLE FAMILY DWELLING
46 82
VACANT SPACE VACANT UNIT
120 117
VACNT WAREHOUSE
42 526
WAREHOUSE/OFFICE WATER TREATMENT PLANT
54 46
Apartment <- (ordered_use[6]+ ordered_use[7]+ ordered_use[8] + ordered_use[9] + ordered_use[10] + ordered_use[11] + ordered_use[12] + ordered_use[13] + ordered_use[14] + ordered_use[15] + ordered_use[16] + ordered_use[17] + ordered_use[30] + ordered_use[31] + ordered_use[33] + ordered_use[34] + ordered_use[35] + ordered_use[36] + ordered_use[37] + ordered_use[38] + ordered_use[39] + ordered_use[84] + ordered_use[85] + ordered_use[90] + ordered_use[91])
I am trying to combine everything that looks like an apartment (building, condo, unit, etc.), so I summed all the entries that look similar. My question is: how can I replace those entries with my combined Apartment total?
To get something to work with I pasted your text into the space between the quotes of:
ordered_use <- read.fwf(textConnection("___"), widths=c(50,50), stringsAsFactors=FALSE)
I then trimmed the blank space, took the names from the odd rows, and applied as.numeric to the even rows:
ordered_use[] <- lapply(ordered_use, trimws)  # trimws() is base R (>= 3.2.0)
ord2 <- data.frame(
  nams = c(ordered_use[c(TRUE, FALSE), "V1"], ordered_use[c(TRUE, FALSE), "V2"]),
  nums = as.numeric(c(ordered_use[c(FALSE, TRUE), "V1"], ordered_use[c(FALSE, TRUE), "V2"]))
)
> head(ord2)
nams nums
1 28706
2 2 UNIT DWELLING 99
3 3 UNIT DWELLING 31
4 APARTMENT 38
5 APARTMENT BUILDING 2042
6 APPARTMENT BUILDING 54
To extract items containing "APART", "APPART", "APT" or "CONDO", use grepl:
> ord2[ grepl("APART|APPART|APT|CONDO", ord2$nams) , ]
nams nums
4 APARTMENT 38
5 APARTMENT BUILDING 2042
6 APPARTMENT BUILDING 54
7 APT BLDG 78
8 APT. 41
9 APT. BUILDING 35
16 CONDOMINIUM 42
60 RESIDENTIAL CONDO 39
110 APARTMENT 4733
111 APARTMENT BLDG 37
112 APARTMENT UNIT 37
113 APT 357
114 APT BUILDING 49
115 APT. BLDG 61
122 CONDO 223
I cannot tell whether your item numbers match up, since you probably have a table object and I have two columns that are not arranged the same way as yours.
> sum( ord2[ grepl("APART|APPART|APT|CONDO", ord2$nams) ,"nums" ])
[1] 7866
You should post the output of dput(head(ordered_use, 20)) if you want an answer tailored to the type of object you have.
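The thread uses R. As a language-neutral illustration of the same idea (match names by pattern, sum the matches, then replace them with one combined entry), here is a small Python sketch. The pairs are copied from the grepl output above, plus two non-matching categories from the question's table. The variable names and the label "APARTMENT (COMBINED)" are assumptions made for illustration only.

```python
import re

# (name, count) pairs from the grepl output, plus two non-matching
# categories (BANK, ARENA) taken from the question's table for contrast.
counts = [
    ("APARTMENT", 38), ("APARTMENT BUILDING", 2042), ("APPARTMENT BUILDING", 54),
    ("APT BLDG", 78), ("APT.", 41), ("APT. BUILDING", 35),
    ("CONDOMINIUM", 42), ("RESIDENTIAL CONDO", 39), ("APARTMENT", 4733),
    ("APARTMENT BLDG", 37), ("APARTMENT UNIT", 37), ("APT", 357),
    ("APT BUILDING", 49), ("APT. BLDG", 61), ("CONDO", 223),
    ("BANK", 302), ("ARENA", 67),
]

# Same alternation as the grepl() call in the answer.
pattern = re.compile(r"APART|APPART|APT|CONDO")

# Sum every count whose name matches the pattern.
apartment_total = sum(n for name, n in counts if pattern.search(name))

# Keep the non-matching entries and append one combined entry.
others = [(name, n) for name, n in counts if not pattern.search(name)]
collapsed = others + [("APARTMENT (COMBINED)", apartment_total)]

if __name__ == "__main__":
    print(apartment_total)
    print(collapsed)
```

Matching by pattern rather than by position avoids the long list of hard-coded indices, which would silently break if the table gained or lost a category.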
I have a dataset with over 900 observations, each observation represents the population of a sub-geographical area for a given year by gender (male, female, all) and 20 different age groups.
I have dropped the variable for the sub-geographical area and I want to collapse the data into the greater geographical area (called Geo).
I am having a difficult time doing a SUM or PROC MEANS because I have so many age groups to sum up and I am trying to avoid writing them all out. I want to collapse across the group year, geo, sex so that I only have 3 observations per Geo (my raw data could have as many as 54 observations).
This is an example of what a tiny section of the raw data looks like:
Year Geo Sex Age0005 Age0610 Age1115 (etc)
2010 1 1 92 73 75
2010 1 2 57 81 69
2010 1 3 159 154 144
2010 1 1 41 38 43
2010 1 2 52 41 39
2010 1 3 93 79 82
2010 2 1 71 66 68
2010 2 2 63 64 70
2010 2 3 134 130 138
2010 2 1 32 35 34
2010 2 2 29 31 36
2010 2 3 61 66 70
This is how I want it to look:
Year Group Sex Age0005 Age0610 Age1115 (etc)
2010 1 1 133 111 118
2010 1 2 109 122 108
2010 1 3 252 233 226
2010 2 1 103 101 102
2010 2 2 92 95 106
2010 2 3 195 196 208
Any ideas? Please help!
You don't have to write out each variable name individually - there are ways of getting around that. E.g. if all of the age group variables that need to be summed up start with age then you can use a : wildcard to match them:
proc summary nway data = have;
var age:;
class year geo sex;
output out = want sum=;
run;
If your variables don't have a common prefix, but are all next to each other in one big horizontal group in your dataset, you can use a double dash list instead:
proc summary nway data = have;
var age0005--age1115; /*Includes all variables between these two*/
class year geo sex;
output out = want sum=;
run;
Note also the use of sum= - this means that each summarised variable is reproduced with its original name in the output dataset.
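For readers who want to check the expected totals outside SAS, the collapse can be sketched in plain Python. This is only an illustration of the grouping logic, not of how proc summary works internally. The variable names are assumptions, and the prefix test stands in for the `age:` wildcard.

```python
from collections import defaultdict

header = ["Year", "Geo", "Sex", "Age0005", "Age0610", "Age1115"]

# Sample rows from the question.
rows = [
    (2010, 1, 1, 92, 73, 75),
    (2010, 1, 2, 57, 81, 69),
    (2010, 1, 3, 159, 154, 144),
    (2010, 1, 1, 41, 38, 43),
    (2010, 1, 2, 52, 41, 39),
    (2010, 1, 3, 93, 79, 82),
    (2010, 2, 1, 71, 66, 68),
    (2010, 2, 2, 63, 64, 70),
    (2010, 2, 3, 134, 130, 138),
    (2010, 2, 1, 32, 35, 34),
    (2010, 2, 2, 29, 31, 36),
    (2010, 2, 3, 61, 66, 70),
]

key_cols = ["Year", "Geo", "Sex"]
# Rough analogue of `var age:;`: select the columns to sum by name prefix.
sum_cols = [c for c in header if c.lower().startswith("age")]

totals = defaultdict(lambda: [0] * len(sum_cols))
for row in rows:
    rec = dict(zip(header, row))
    key = tuple(rec[c] for c in key_cols)
    for i, col in enumerate(sum_cols):
        totals[key][i] += rec[col]

# One output row per (Year, Geo, Sex) combination, sorted by key.
collapsed = [list(key) + sums for key, sums in sorted(totals.items())]

if __name__ == "__main__":
    for line in collapsed:
        print(line)
```

Selecting the summed columns by prefix means that adding the remaining age groups only requires extending `header` and `rows`, not the summing code.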
I personally like to use proc sql for this, since it makes it very clear what you're summing and grouping by.
data old ;
input Year Geo Sex Age0005 Age0610 Age1115 ;
datalines;
2010 1 1 92 73 75
2010 1 2 57 81 69
2010 1 3 159 154 144
2010 1 1 41 38 43
2010 1 2 52 41 39
2010 1 3 93 79 82
2010 2 1 71 66 68
2010 2 2 63 64 70
2010 2 3 134 130 138
2010 2 1 32 35 34
2010 2 2 29 31 36
2010 2 3 61 66 70
;
run;
proc sql ;
create table new as select
year
, geo label = 'Group'
, sex
, sum(age0005) as age0005
, sum(age0610) as age0610
, sum(age1115) as age1115
from old
group by geo, year, sex ;
quit;
I have to show my table data in sort order by design_no
Here is my data
design_no fname meter rate s m l xl
---------------------------------------------------------------
3092 2111-1 432.00 235.00 32 33 21 21
3092 2111-1 498.75 235.00 38 37 24 24
3092 2111-1 460.50 235.00 31 35 23 24
3092 2111 501.75 245.00 37 38 25 24
I want to show it like this:
design_no fname meter rate pcs
---------------------------------------------------
3092 2111 501.75 245.00 124
3092 2111-1 1391.25 235.00 343
Kindly help me.
SELECT design_no, fname, SUM(meter) AS meter, rate,
       SUM(s) + SUM(m) + SUM(l) + SUM(xl) AS pcs
FROM tab
GROUP BY design_no, fname, rate
ORDER BY design_no, fname
What behaviour do you want if the rate is different for the same design_no and fname?
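To try the grouping against the sample rows, the query can be run in an in-memory SQLite database from Python. This is a sketch under assumptions: the question does not name its database, the column types are guessed from the data, and the aliases and ORDER BY clause are written out explicitly so the result matches the requested layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions inferred from the sample data.
conn.execute(
    "CREATE TABLE tab (design_no INTEGER, fname TEXT, meter REAL, rate REAL, "
    "s INTEGER, m INTEGER, l INTEGER, xl INTEGER)"
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (3092, "2111-1", 432.00, 235.00, 32, 33, 21, 21),
        (3092, "2111-1", 498.75, 235.00, 38, 37, 24, 24),
        (3092, "2111-1", 460.50, 235.00, 31, 35, 23, 24),
        (3092, "2111", 501.75, 245.00, 37, 38, 25, 24),
    ],
)

# Total the meters and the four size columns per design/fname/rate group.
result = conn.execute(
    """
    SELECT design_no, fname, SUM(meter) AS meter, rate,
           SUM(s) + SUM(m) + SUM(l) + SUM(xl) AS pcs
    FROM tab
    GROUP BY design_no, fname, rate
    ORDER BY design_no, fname
    """
).fetchall()
conn.close()

if __name__ == "__main__":
    for row in result:
        print(row)
```

Because rate is part of the GROUP BY, rows that share a design_no and fname but differ in rate would come out as separate groups, which is the case the follow-up question asks about.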
I am trying to set a variable equal to state fips codes given its state abbreviation. Is there a shorter way to do this other than:
replace fips = "[fips code]" if other_variable=="[state_abbrev]"
Which I currently have 50 lines of. I would like to create a loop, but given that I have two changing values, I don't know how to avoid looping through every permutation.
Here is an example of the strategy covered in the FAQ.
1) Create a dataset containing two variables: the state name and the associated fips code. To make this slightly more flexible, I include common semi-abbreviations for the state name. In the future, you could add a third variable that includes the two-letter state abbreviation.
clear
input fips str20 state
1 "alabama"
2 "alaska"
4 "arizona"
5 "arkansas"
6 "california"
8 "colorado"
9 "connecticut"
10 "delaware"
11 "district of columbia"
12 "florida"
13 "georgia"
15 "hawaii"
16 "idaho"
17 "illinois"
18 "indiana"
19 "iowa"
20 "kansas"
21 "kentucky"
22 "louisiana"
23 "maine"
24 "maryland"
25 "massachusetts"
26 "michigan"
27 "minnesota"
28 "mississippi"
29 "missouri"
30 "montana"
31 "nebraska"
32 "nevada"
33 "new hampshire"
34 "new jersey"
35 "new mexico"
36 "new york"
37 "north carolina"
37 "n. carolina"
38 "north dakota"
38 "n. dakota"
39 "ohio"
40 "oklahoma"
41 "oregon"
42 "pennsylvania"
44 "rhode island"
45 "south carolina"
45 "s. carolina"
46 "south dakota"
46 "s. dakota"
47 "tennessee"
48 "texas"
49 "utah"
50 "vermont"
51 "virginia"
53 "washington"
54 "west virginia"
54 "w. virginia"
55 "wisconsin"
56 "wyoming"
72 "puerto rico"
end
save statefips, replace
2) Load your primary dataset that holds a variable with state names and perform a many-to-one merge using statefips.dta.
sysuse census, clear
// Convert the state names to lowercase to ensure
// consistency with the statefips dataset
replace state = lower(state)
merge m:1 state using statefips.dta
drop if _merge == 2
drop _merge
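If you wanted to preserve the case of the state names in your master dataset, you could instead generate a temporary lowercase variable and use that for the merge. Note that merge requires the key variable to exist under the same name in both datasets, so the state variable in statefips.dta would need to be named statelower as well, i.e.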
gen statelower = lower(state)
merge m:1 statelower using statefips.dta
Also, once you've created the statefips.dta data set, there's no need to recreate it every time you want to perform a merge. You could simply bundle it along with your project's files and use it when necessary. If you find you want to add two-letter state abbreviations or make some other change, then it's practically instantaneous to recreate it.
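The lookup-table strategy is not specific to Stata. As a hedged illustration of the same many-to-one match, here is a Python sketch that normalises the state name to lower case and looks it up in a small table. `STATE_FIPS` is only an excerpt of the list above, and `add_fips`, `master` and the unmatched "Guam" record are hypothetical names and data used for demonstration.

```python
# Lookup table: lower-case state name -> FIPS code (excerpt of the list above).
STATE_FIPS = {
    "alabama": 1,
    "alaska": 2,
    "north carolina": 37,
    "n. carolina": 37,
    "puerto rico": 72,
}


def add_fips(records, lookup=STATE_FIPS):
    """Return copies of the records with a 'fips' field added.

    The original 'state' value is left untouched; only the lookup key is
    lower-cased. Records without a match get fips=None, similar to keeping
    unmatched master observations after a merge.
    """
    merged = []
    for rec in records:
        key = rec["state"].strip().lower()
        merged.append({**rec, "fips": lookup.get(key)})
    return merged


# Hypothetical master data; "Guam" is deliberately absent from the excerpt.
master = [{"state": "Alabama"}, {"state": "N. Carolina"}, {"state": "Guam"}]
merged = add_fips(master)

if __name__ == "__main__":
    for rec in merged:
        print(rec)
```

Keeping the table as data, rather than as 50 conditional replace statements, means that a new spelling or abbreviation is a one-line addition to the table.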
No obvious shortcut, but in Stata
. search merge, faq
to find a relevant FAQ by Kit Baum.