Plot Polygons with Carto - maps

I have the following data file, where each row holds one coordinate:
id - identifies the polygon
barrio - you can ignore (it is the name of the place)
volume - the value that will determine the color of the polygon
long - longitude
lat - latitude
0 Pueblo Nuevo 32721.5 -3.6449779397 40.4307339003
0 Pueblo Nuevo 32721.5 -3.64510182294 40.4307061424
0 Pueblo Nuevo 32721.5 -3.64534323472 40.4306011803
0 Pueblo Nuevo 32721.5 -3.64558445341 40.4304737628
0 Pueblo Nuevo 32721.5 -3.64582154951 40.4303417733
0 Pueblo Nuevo 32721.5 -3.64594391235 40.4302735093
1 Palacio 24301.5 -3.71015464172 40.4229425859
1 Palacio 24301.5 -3.7102954769 40.4228156123
1 Palacio 24301.5 -3.71057024411 40.4225706548
1 Palacio 24301.5 -3.71060800746 40.422516382
1 Palacio 24301.5 -3.71066930547 40.4224934781
Using Carto, is it possible to plot the 2 polygons (similar to this: http://sensitivecities.com/images/london_plaque_density.png)?
I tried to do it with the analysis/group-polygons option, but the result looks strange. Also, some of the vertices are superimposed.

That looks like a combination of:
Create point geometries from coordinates
Aggregate by the id, but you lack an ordering field.
Assuming you can group and order your points, you can create a line and then a polygon.
Of course, all of this needs to be done with SQL and PostGIS functions. Alternatively, you can break the process into two parts: with CARTO BUILDER you can get the lines, but there is no analysis for building areas, so you would need to save them as a new table and then create the polygons with ST_BuildArea or ST_MakePolygon.
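In case it helps, here is a minimal Python sketch of the grouping-and-ordering step described above, building WKT polygon strings from rows like the ones in the question. The ids, names, and coordinates are taken from the sample data, and the rows are assumed to already be in ring order (the ordering field mentioned above). In CARTO itself the equivalent would be done in SQL with the PostGIS functions named above.

```python
# Sketch: group coordinate rows by polygon id and emit WKT polygons,
# assuming the rows are already listed in ring order.
rows = [
    (0, "Pueblo Nuevo", -3.6449779397, 40.4307339003),
    (0, "Pueblo Nuevo", -3.64510182294, 40.4307061424),
    (0, "Pueblo Nuevo", -3.64534323472, 40.4306011803),
    (1, "Palacio", -3.71015464172, 40.4229425859),
    (1, "Palacio", -3.7102954769, 40.4228156123),
    (1, "Palacio", -3.71057024411, 40.4225706548),
]

rings = {}
for pid, _barrio, lon, lat in rows:
    rings.setdefault(pid, []).append((lon, lat))

def to_wkt(ring):
    # a WKT polygon ring must be closed: repeat the first vertex at the end
    pts = ring + [ring[0]]
    return "POLYGON((" + ", ".join(f"{x} {y}" for x, y in pts) + "))"

polygons = {pid: to_wkt(ring) for pid, ring in rings.items()}
```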

Related

Drawing forest plot Stata

I would like to ask if it is possible to create a forest plot for the effect size after running a regression loop for multiple outcomes. I have used the command below, which I found in this question. https://www.statalist.org/forums/forum/general-stata-discussion/general/1701936-extract-coefficient-and-p-value-for-certain-variable-from-regression-loop
frame create results_NProt_Pulse_Velocity
frame results_NProt_Pulse_Velocity {
    set obs 1000
    gen outcome = ""
    gen coef = .
    gen SE = .
    gen pvalue = .
    gen ci_l = .
    gen ci_u = .
}
local counter 0
foreach outcome of varlist CHIP-C34_HIV_F {
    regress `outcome' PulseWaveVelocity Age sq_age
    if r(table)[4, 1] < 0.05 {
        local ++counter
        frame results_NProt_Pulse_Velocity {
            replace outcome = "`outcome'" in `counter'
            replace coef = `=r(table)[1, 1]' in `counter'
            replace SE = `=r(table)[2, 1]' in `counter'
            replace pvalue = `=r(table)[4, 1]' in `counter'
            replace ci_l = `=r(table)[5, 1]' in `counter'
            replace ci_u = `=r(table)[6, 1]' in `counter'
        }
    }
}
frame change results_NProt_Pulse_Velocity
drop if missing(outcome)
browse
And after this command I got a results frame listing the outcome names, coefficients, standard errors, p-values, and confidence limits.
So how can I create a forest plot after getting this result?
I want the forest plot to include the names of the outcomes and the coefficient value on the y-axis.

Rename variables in SAS using macros

I have a dataset which has the last 12 months' debit and credit turnover and average balance of the customers. The data looks something like this:
| Accountid | Monthly_credit_turnover1 | Monthly_credit_turnover2 | average_bal_1 | average_bal_2 |
I want to replace the suffixes 1, 2, 3 and so on up to 12 with the last 12 months. For example, average_bal_1 should correspond to average_bal_march and average_bal_2 should be replaced with average_bal_april; so every variable with suffix 1 corresponds to March, 2 to April, and so on.
The code that I have written is
%macro renaming(i=, mon=);
data bands_macro;
set may_1;
*this just renames one variable with the two input parameters;
turnover_&i.=sum(MonthlyCreditTurnover&i., MonthlyDebitTurnover&i.);
format tn_bands $50.;
if turnover_&i. le 0 then tn_bands="1. LE 0";
else if turnover_&i. gt 0 and turnover_&i. le 1000 then tn_bands="2. 0-1k";
else if turnover_&i. gt 1000 and turnover_&i. le 4000 then tn_bands="3. 1k-4k";
else if turnover_&i. gt 4000 and turnover_&i. le 10000 then tn_bands="4. 4k-10k";
else tn_bands="5. >10k";
format ab_bands $50.;
if averagebalance&i. =999999999999 or averagebalance&i. le 0 then ab_bands="1.LE 0";
else if averagebalance&i. gt 0 and averagebalance&i. le 1000 then ab_bands="2. 0-1k";
else if averagebalance&i. gt 1000 and averagebalance&i. le 5000 then ab_bands="3. 1k-5k";
else if averagebalance&i. gt 5000 and averagebalance&i. le 10000 then ab_bands="4. 5k-10k";
else if averagebalance&i. gt 10000 and averagebalance&i. le 25000 then ab_bands="5. 10k-25k";
else if averagebalance&i. gt 25000 and averagebalance&i. le 50000 then ab_bands="6. 25k-50k";
else ab_bands="7. >50k";
drop MonthlyCreditTurnover&i. MonthlyDebitTurnover&i.;
run;
%mend;
%renaming(i=1,mon=Mar21);
%renaming(i=2,mon=Feb21);
But unfortunately I am getting this warning when I run this code:
variable turnover2 cannot be renamed as turnover_april because turnover_april already exists. How do I make these changes in a single dataset?
So, this is a pretty good use of a simple in-data step macro.
data mydata;
turnover1 = 1;
turnover2 = 2;
run;
*give two parameters, the numeric value and the month to convert to;
%macro renaming(i=, mon=);
*this just renames one variable with the two input parameters;
rename turnover&i.= turnover_&mon.;
%mend;
data want;
set mydata;
*now call each of the macro iterations separately;
%renaming(i=1,mon=mar)
%renaming(i=2,mon=apr)
; *just to make the highlighting work;
run;
It would certainly be possible to generate this list of macro calls more systematically, though:
*dataset of the i=mon relationships, you can have this in excel or whatever;
*and change it every month when the other stuff updates;
*you also may be able to "calculate" all of this?;
data renames;
input i mon $;
datalines;
1 mar
2 apr
;;;;
run;
*pull that into a macro variable using PROC SQL SELECT INTO;
proc sql;
select cats('%renaming(i=',i,',mon=',mon,')')
into :renamelist separated by ' '
from renames;
quit;
*now apply the rename list;
data want;
set mydata;
&renamelist.
;
run;
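On the "you also may be able to calculate all of this?" comment: the i-to-month pairs could indeed be computed rather than typed in. Here is a hedged Python sketch of that idea (the anchor value, i.e. suffix 1 = March, is an assumption taken from the question's example); the resulting pairs would then feed the %renaming calls or the renames dataset.

```python
# Sketch: compute the suffix -> month-abbreviation mapping instead of
# maintaining it by hand. anchor = 3 means suffix 1 corresponds to March.
import calendar

anchor = 3  # assumption from the question: suffix 1 = March
mapping = {i: calendar.month_abbr[(anchor + i - 2) % 12 + 1].lower()
           for i in range(1, 13)}
# mapping[1] is "mar", mapping[2] is "apr", ..., mapping[12] is "feb"
```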

How to import an array of all the non-blank cells of a sheet (for matrix multiplication)?

I've got a spreadsheet for calculating the nutrition data for each meal I eat. One sheet ("ingredients") contains a matrix of nutrition data for each ingredient. Another sheet ("recipes") contains a matrix of servings of each ingredient for each meal. In my "meal nutrition data" sheet, I want to multiply those matrices to get the nutrition data for each recipe.
I can do this for the definite number of ingredients and recipes that I currently have: =mmult(ingredients!B5:AF39,recipes!B2:G32), but if I add recipes or ingredients, I want it to automatically update, so I don't want that formula to be hardcoded. Since matrices need a value in all cells to multiply and since the constraint NxP * PxM (num rows x num columns) must hold, I need to import an array of all of the non-blank cells of each sheet into the MMULT() formula.
My current attempt =mmult(query(ingredients!B5:FN39,"select * where Col1 is not null",0),query(recipes!B2:G32,"select * where Col1 is not null",0)) is not working. It throws the error: "Unable to parse query string for Function QUERY parameter 2: NO_COLUMN: Col1".
I can't figure out what should replace the word "Col1", or if my formula is an efficient way of achieving these results.
I also tried the filter() function but it cannot work with an array.
file.csv
ingredient,"Great Value Golden Sweet Whole Kernel Corn, 15 oz","Great Value White Kidney Beans Cannellini Beans, 15.5 oz","Organic Great Value Garbanzo Beans Chick Peas, 15 oz","Great Value Sweet Peas, 15 oz",Great Value Chunk Light Tuna in Water 12 oz. Can,"Great Value Chunk Light Tuna in Water, 5 oz","Great Value Sardines in Water, 3.75 oz","Great Value 100% Whole Wheat Bread, Round Top, 20 oz","Great Value Wheat Sandwich Bread, 20 oz",Unsalted Roasted Peanuts,"Tyson® Fully Cooked Chicken Patties, 59.2 oz. (Frozen)","Butterball All Natural 90%/10% Lean Ground Turkey, 1 lb.",Sweet Potato,Potato,"Country Crock Original Butter Spread, 15 oz",Great Value Extra Virgin Olive Oil 101 oz,Romaine Lettuce,Great Value Teriyaki Sauce 15 oz,Great Value Soy Sauce 15 oz,Great Value Ketchup 64 oz,Great Value Reduced Fat Mayo with Olive Oil 30 oz,Kroger Ranch Dressin 16 oz,"Great Value Light Creamy Caesar Dressing, 16 oz",Large Egg,"Great Value Diced Tomatoes In Tomato Juice, 28 Oz",Green Bell Pepper,Cucumber,"Great Value Italian Style Bread Crumbs, 15 oz","Great Value Finely Shredded Parmesan Cheese, 6 oz","Great Value Frozen Rising Crust Supreme Pizza, 30.7 oz",Nature Valley Chocolate Pretzel Nut Chewy Granola Bars,Great Value Long Grain Enriched Rice,Broccoli Stir-Fry,Poke Marinade and Sauce ,Great Value Blueberries,Great Value Frozen Mango Chunks,Great Value Frozen Chopped Spinach,Bananas,Blue Diamond Unsweetened Original Almond Milk,Tropicana 100% Orange Juice (some pulp),Kroger 100% Pineapple Juice (unsweetened),Zucchini,Simple Truth Organic Tofu Extra Firm,"Classico Signature Recipes Traditional Basil Pesto Sauce and Spread, 8.1 oz Jar","Frozen Cooked Medium Peeled & Deveined Tail-On Shrimp, 12 oz"
ingredient type,starchy vegetable,bean,bean,vegetable,protein,protein,protein,grain,grain,snack,protein,protein,starchy vegetable,starchy vegetable,fat,fat,vegetable,sauce,sauce,sauce,sauce,sauce,sauce,protein,vegetable,vegetable,vegetable,grain,dairy,meal,snack,carb,vegetable,sauce,fruit,fruit,vegetable,fruit,dairy,fruit,fruit,vegetable,protein,sauce,protein
servings per container,3.5,3.5,3.5,3.5,3,8,1,22,22,6,22,4,1,1,,,,,,,,,,,7,1,1,,,6,,50,7,16,3,10,4,,8,6,8,,0,4,4
serving size,1/2 cup,1/2 cup,1/2 cup,1/2 cup,3 oz,2 oz,1 can,1 slice,1 slice,1 oz,1 patty,4 oz,1 medium potato (114 g),1 medium potato (148 g),1 tbsp,1tbsp,2 cups (94g),1 tbsp,1 tbsp,1 tbsp,1 tbsp,2 tbsp,2 tbsp,1 egg,1/2 cup,1 pepper (100 g),1 cucumber (104g),1/3 cup,1/3 cup,1/6 pizza,1,1/4 cup,1 cup,1 tbsp,1 cup,1 cup,1 cup,1 banana,1 cup,8 fl oz,8 fl oz,1 zucchini,1 pack,1/4 cup,11
calories,45,110,110,60,80,45,100,60,70,170,200,190,103,110,50,120,15,15,5,20,50,140,70,70,25,20,16,110,110,330,150,160,30,50,100,130,35,105,30,110,120,33,270,240,100
total fat (g),0.5,0,2,0,0.5,0.5,4.5,0.5,1,15,13,11,0,0,6,14,0,0,0,0,6,14,6,5,0,0,0,1.5,7,12,5,0,0,0,1.5,0.5,0.5,0.4,2.5,0,0,0.6,13.5,24,1.5
saturated fat (g),0,0,0,0,0,0,1.5,0,0,2,3,2.5,0,0,1.5,2,0,0,0,0,1,2.5,1,1.5,0,0,0,0,4,4.5,2,0,0,0,0,0,0,0.1,0,0,0,0.2,1.5,4.5,0.5
trans fat (g),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.2,0,0,0
polyunsaturated fat (g),0,0,0,0,0,0,1.5,0,0,0,4.5,0,0,0,2.5,0,0,0,0,0,3,0,3.5,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0.1,0.5,0,0,0,0,0,0
monounsaturated fat (g),0,0,0,0,0,0,4,0,0,0,5,0,0,0,1,0,0,0,0,0,1.5,0,1.5,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1.5,0,0,0,0,0,0
cholesterol (mg),0,0,0,0,35,20,75,0,0,0,35,80,0,0,0,0,0,0,0,0,5,5,0,185,0,0,0,0,20,30,0,0,0,0,0,0,0,0,0,0,0,0,0,5,180
sodium (mg),200,270,120,300,270,180,300,110,135,0,400,80,41,0,100,0,10,600,900,160,110,260,560,70,180,3,2,420,340,810,110,0,20,710,0,0,80,1,170,0,35,16,45,590,710
total carbohydrates (g),9,20,18,11,0,0,0,11,13,5,10,0,24,26,0,0,3,3,1,5,1,2,3,0,5,5,4,65,1,42,24,36,7,11,24,29,4,27,1,26,30,6,9,5,2
dietary fiber (g),1,6,4,3,0,0,0,2,1,3,0,0,4,2,0,0,2,2,0,0,0,0,0,0,1,2,2,21,0,2,1,1,2,0,6,3,3,3.1,1,0,1,2,3,1,0
total sugars (g),2,1,3,5,0,0,0,1,2,1,0,0,7,1,0,0,1,2,0,4,0,1,2,0,3,2,2,1,0,4,9,0,2,9,17,27,0,14,0,22,26,4.9,3,0,0
added sugars (g),0,0,0,2,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,3,0,1,2,0,0,0,0,2,0,13,8,0,0,9,0,0,0,0,0,0,0,0,0,0,0
protein (g),1,7,6,3,18,0,16,4,2,8,9,20,2,3,0,0,1,0,1,0,0,0,1,6,1,1,0,4,10,0,2,3,2,1,0,1,3,1.3,1,2,0,2.4,27,3,20
potassium (mg),125,440,210,133,210,110,0,0,30,210,0,0,0,620,0,0,232,0,0,0,0,0,0,70,93,0,152,0,0,0,0,50,200,55,130,340,245,422,160,450,260,0,282,0,0
vitamin A (mg),0,0,0,0,0,0,18,0,0,0,0,0,"3,942",0,90,0,0,0,0,0,0,0,0,80,0,63,0,0,0,72,0,0,0,0,0,0,0,10,150,0,0,63,0,225,36
calcium (mg),14,50,30,17,0,0,325,50,20,30,0,104,52,20,0,0,31,0,0,0,0,0,0,30,0,13,16,78,325,195,0,10,29,0,0,40,124,0,450,20,0,39,390,195,104
vitamin D (mcg),0,0,0,0,2,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0
vitamin C (mg),0,0,0,0,0,0,0,0,0,0,0,0,11.1,27,0,0,4,0,0,0,0,0,0,0,0,120,0,0,0,9,0,0,0,0,0,101,0,15.3,0,90,72,52,0,7.2,0
iron (mg),0,2.2,1.2,1,1.2,0.6,1.7,0.8,1,1,0.4,1.8,0.7,1.1,0,0,1,0,0,0,0,0,0,0.9,0,0.4,2,1.1,0,2.7,0.9,1.9,1,0,0,0,1.6,0.2,0.7,0,0,0.36,4.32,1.08,0.36
vitamin b-6 (mg),0,0,0,0,0,0,0,0,0,0,0,0,0,0.2,0,0,0.1,0,0,0,0,0,0,0.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0.5,0,0,0,0.9,0,0,0
magnesium (mg),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,32,15,0,0,33.6,0,0,0
thiamine (mg),0,0,0,0,0,0,0,0,0.1,0,0,0,0,0,0,0,0.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.26,0,0,0,0,0,0,0,0,0,0,0,0,0
riboflavin (mg),0,0,0,0,0,0,0,0,0.1,0,0,0,0,0,0,0,0.1,0,0,0,0,0,0,0.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
niacin (mg),0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1.4,0,0,0,0,0,0,0,1.9,0,0,0,0,0,0,0,0,0,0,0,0,0
folate (mcg),0,0,0,0,0,0,0,0,20,0,0,0,0,0,0,0,128,0,0,0,0,0,0,25,0,0,0,0,0,0,0,174,0,0,0,0,0,0,0,0,0,0,0,0,0
vitamin k (mcg),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,96,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
vitamin b-12 (mcg),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
zinc (mg),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
vitamin e (mg),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7.5,0,0,0,0,0,0
biotin (mcg),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,11,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
pantothenic acid (mg),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
phosphorus (mg),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,100,0,0,0,0,0,0,0,0,0,0,0,0,0,0,20,0,0,0,0,0,0
iodine (mcg),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,28,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
selenium (mcg),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,15,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
choline (mg),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,150,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
try:
=ARRAYFORMULA(MMULT(
INDIRECT("ingredients!B5:"&ADDRESS(COUNTA(ingredients!A5:A)+4,
MAX(IF(ingredients!1:1="",,COLUMN(ingredients!1:1))))),
INDIRECT("recipes!B2:"&ADDRESS(COUNTA(recipes!A2:A)+1,
MAX(IF(recipes!1:1="",,COLUMN(recipes!1:1)))))))
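For what it's worth, the logic the formula implements can be sketched outside of Sheets as well: trim each range down to its non-blank extent, then matrix-multiply. This Python stand-in (blank cells modelled as None, toy numbers rather than the real nutrition data) only illustrates the idea; it is not part of the sheet itself.

```python
# Sketch: drop all-blank rows and trailing blank columns, then multiply,
# mirroring what the INDIRECT/ADDRESS trick does for MMULT.
def trim(grid):
    rows = [r for r in grid if any(c is not None for c in r)]
    ncols = max(i + 1 for r in rows for i, c in enumerate(r) if c is not None)
    return [[0 if c is None else c for c in r[:ncols]] for r in rows]

def mmult(a, b):
    # plain NxP * PxM matrix product
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

ingredients = [[1, 2, None], [3, 4, None], [None, None, None]]
recipes = [[5, None], [6, None], [None, None]]
result = mmult(trim(ingredients), trim(recipes))
```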

Filling in missing values in one dataset based on another in presence of repeated observations in R

Using R, I would like to use information from dataframe 2 to fill in missing values in dataframe 1. Here are the headers from my files. File 1 is a dataframe with data and location (long/lat) of an event. Some of the spatial information is missing.
> head(file1)
day.of.event longitude latitude PLZ
1 01.01.2009 750303 243535 9050
2 01.01.2009 645616 235136 5056
3 01.01.2009 722132 253715 9602
4 01.01.2009 645149 222845 8836
5 01.01.2009 NA NA 3000
6 01.01.2009 NA NA 3000
However, based on the postcode (PLZ), I can find these in the Swiss official register (cadastre). The NAs in the first file should be replaced by the E/N values corresponding to the PLZ (postcode).
> head(file2)
Ortschaftsname PLZ Zusatzziffer Gemeindename Kantonskürzel E N
1 Aadorf 8355 0 Aadorf TG 710450 261277
2 Aarau 5000 0 Aarau AG 646063 248867
3 Aarau 5004 0 Aarau AG 646950 250197
4 Aarau Rohr 5032 0 Aarau AG 648491 250615
5 Aarberg 3270 0 Aarberg BE 588188 210368
6 Aarburg 4663 0 Aarburg AG 635148 241461
Now, as I have several hundred thousand events, each postcode is repeated, but I would like to replace all NAs for postcode "3000" (for example) with the same longitude (E) and latitude (N), and repeat this for all NAs.
There must be an easier way than doing this manually?
The following is not the best way to do this task, but if the order does not matter then you could do something like this.
a<-subset(file1,PLZ==3000) # extract all the rows where PLZ is 3000
b<-subset(file1,PLZ!=3000) # remaining part of dataframe
a$longitude<-rep(lonvalue,nrow(a))
a$latitude<-rep(latvalue,nrow(a))
file1<-rbind(b,a)
In the above code, either hardcode the latitude and longitude values you want to insert, or pass them in as variables.
EDIT:
You can write a loop. Iterate over all rows of file1
something like:
for (row in seq_len(nrow(file1))) {
  if (is.na(file1$longitude[row])) {
    t <- subset(file2, PLZ == file1$PLZ[row])
    file1$longitude[row] <- t$E
    file1$latitude[row] <- t$N
  }
}
The above will work if file2 contains a single row for each PLZ.
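The per-row subset could also be replaced by a one-time lookup table, which scales better to hundreds of thousands of rows; in R itself this is usually done with match() or merge(). The sketch below just shows the lookup idea in plain Python, with made-up file contents (the E/N pair for PLZ 3000 is hypothetical).

```python
# Sketch: build a PLZ -> (E, N) table once, then fill every missing
# coordinate by lookup. All values below are illustrative only.
file2 = [
    {"PLZ": 8355, "E": 710450, "N": 261277},
    {"PLZ": 3000, "E": 600000, "N": 200000},  # hypothetical entry for PLZ 3000
]
file1 = [
    {"PLZ": 9050, "longitude": 750303, "latitude": 243535},
    {"PLZ": 3000, "longitude": None, "latitude": None},
    {"PLZ": 3000, "longitude": None, "latitude": None},
]

coords = {row["PLZ"]: (row["E"], row["N"]) for row in file2}
for row in file1:
    if row["longitude"] is None and row["PLZ"] in coords:
        row["longitude"], row["latitude"] = coords[row["PLZ"]]
```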

Gnuplot: import x-axis from file

I have two files 'results.dat' and 'grid.dat'.
results.dat contains a different data set of y values in each row.
1 325.5 875.4 658.7 365.5
2 587.5 987.5 478.6 658.5
3 987.1 542.6 986.2 458.7
The grid.dat contains the corresponding x values.
1 100.0 200.0 300.0 400.0
How can I plot with gnuplot the grid.dat as x values and a specific line of results.dat as the corresponding y values? E.g. for line 3:
1 100.0 987.1
2 200.0 542.6
3 300.0 986.2
4 400.0 458.7
Thanks in advance.
That's quite similar to the recent question Gnuplot: plotting the maximum of two files. In your case, too, it is not possible to do this with gnuplot alone.
You need an external tool to combine the two files on-the-fly, e.g. with the following python script (any other tool would also do):
""" selectrow.py: Select a row from 'results.dat' and merge with 'grid.dat'."""
import numpy as np
import sys
line = int(sys.argv[1])
A = np.loadtxt('grid.dat')  # 1-D array: row index followed by the x values
B = np.loadtxt('results.dat', skiprows=line - 1, max_rows=1)  # only the requested row
# drop the leading row index of each file before pairing x with y
np.savetxt(sys.stdout, np.c_[A[1:], B[1:]], delimiter='\t')
And then plot the third line of results.dat with
plot '< python selectrow.py 3' w l
