Access 2013-Need to concatenate field 1 with unique value, with multiple other fields with duplicate values - concatenation

I have the following 4 records, and need the concatenation to read like the second example
Would someone please show me the formula for the query? Most likely will involve 2 tables.
Field names are OE, Year, Make, Model.
Record 1 = OE - 000 353 13 73, Year - 2005, Make - Chevy, Model - Camaro.
Record 2 = OE - 000 353 13 73, Year - 2006, Make - Chevy, Model - Camaro.
Record 3 = OE - 000 353 13 73, Year - 2006, Make - Buick, Model - Regal.
I want the following results. Record 1 = 2005, 2006, Chevy, Buick, Camaro, Regal
Thank you.
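The grouping logic being asked for can be sketched outside Access as follows. This is only an illustration in Python, not an Access query: the function name `concat_distinct` is made up for the example, and the same steps (group by OE, keep each distinct Year, Make and Model once, join with commas) would still have to be translated into an Access query or VBA function.

```python
from collections import OrderedDict

# Sample rows taken from the question: (OE, Year, Make, Model).
records = [
    ("000 353 13 73", 2005, "Chevy", "Camaro"),
    ("000 353 13 73", 2006, "Chevy", "Camaro"),
    ("000 353 13 73", 2006, "Buick", "Regal"),
]


def concat_distinct(rows):
    """Group rows by OE and join the distinct Year, Make and Model values.

    Values keep their first-seen order and duplicates are dropped.
    (Hypothetical helper, named for this example only.)
    """
    grouped = OrderedDict()
    for oe, year, make, model in rows:
        years, makes, models = grouped.setdefault(oe, ([], [], []))
        for bucket, value in ((years, str(year)), (makes, make), (models, model)):
            if value not in bucket:
                bucket.append(value)
    return {
        oe: ", ".join(years + makes + models)
        for oe, (years, makes, models) in grouped.items()
    }


result = concat_distinct(records)
print(result["000 353 13 73"])  # 2005, 2006, Chevy, Buick, Camaro, Regal
```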

Related

Create a table in Data Studio that shows current/latest value and a previous value and compares them

I have two sheets in a Google Sheet that has historical data for some internal programs.
- programs
  - high level data about each program
  - each program has data for all of the different report dates
- metrics
  - specific metrics about each program for each report date
Using my example data, there are 4 programs: a, b, c, and d and I have reports for 1/1/2020, 1/15/2020, 2/3/2020, and 6/20/2020.
I want to create a Data Studio report that will:
- combine the data on report date and program
- show a filter where the user can select which previous report date they want to compare against
  - the filter should show all of the previous report dates and default select the most recent one
  - for example, the filter would show:
    - 1/1/2020
    - 1/15/2020
    - 2/3/2020 (default selected)
  - the filter should only allow selecting one
- a table showing values for current report date and values for the report date selected in the above filter
Here is an example table using the source data in my above linked Google Sheet when the report is initially loaded and the filter has 2/3/2020 default selected:
report date | program | id | l1 manager | status | current value | previous report date value | direction
6/20/2020 | a | 1 | Carlos | bad | 202 | 244 | up
6/20/2020 | b | 2 | Jack | bad | 202 | 328 | up
6/20/2020 | c | 3 | Max | bad | 363 | 249 | down
6/20/2020 | d | 4 | Henry | good | 267 | 284 | up
If the user selects 1/1/2020 in the filter, then the table would show:
report date | program | id | l1 manager | status | current value | previous report date value | direction
6/20/2020 | a | 1 | Carlos | bad | 202 | 220 | up
6/20/2020 | b | 2 | Jack | bad | 202 | 348 | up
6/20/2020 | c | 3 | Max | bad | 363 | 266 | down
6/20/2020 | d | 4 | Henry | good | 267 | 225 | down
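The comparison described above can be sketched in Python to make the intended logic concrete. This is not a Data Studio configuration, only an illustration under stated assumptions: the values are copied from the two example tables (the 1/15/2020 values are not shown in the question, so they are left out), the function names are invented for the example, and the `direction` rule simply mirrors the example tables, where "up" appears whenever the previous value is larger than the current one.

```python
from datetime import datetime

# (report date, program) -> value, copied from the example tables.
values = {
    ("6/20/2020", "a"): 202, ("6/20/2020", "b"): 202,
    ("6/20/2020", "c"): 363, ("6/20/2020", "d"): 267,
    ("2/3/2020", "a"): 244, ("2/3/2020", "b"): 328,
    ("2/3/2020", "c"): 249, ("2/3/2020", "d"): 284,
    ("1/1/2020", "a"): 220, ("1/1/2020", "b"): 348,
    ("1/1/2020", "c"): 266, ("1/1/2020", "d"): 225,
}


def parse_date(text):
    """Parse a date written as month/day/year, e.g. 2/3/2020."""
    return datetime.strptime(text, "%m/%d/%Y")


def filter_options(data):
    """Return (latest report date, list of earlier report dates, oldest first)."""
    dates = sorted({date for date, _ in data}, key=parse_date)
    return dates[-1], dates[:-1]


def comparison_table(data, selected=None):
    """Compare the latest report date with one earlier date.

    When nothing is selected, the most recent earlier date is used,
    matching the default described for the filter.
    """
    latest, previous = filter_options(data)
    if selected is None:
        selected = previous[-1]
    rows = []
    for program in sorted(p for d, p in data if d == latest):
        current = data[(latest, program)]
        earlier = data.get((selected, program))
        if earlier is None:
            direction = None
        else:
            # Assumption: follows the example tables, where "up" is shown
            # when the previous value is larger than the current value.
            direction = "up" if earlier > current else "down"
        rows.append((latest, program, current, earlier, direction))
    return rows


for row in comparison_table(values):
    print(row)
```

Calling `comparison_table(values, "1/1/2020")` corresponds to the user picking 1/1/2020 in the filter.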

Data Studio - Percentage Breakdown from multiple columns

I have a Google Sheet connected to a Google Data Studio file. The data is structured as follows:
ID Stream1 Stream2 Stream3 Total
001 10 5 5 20
002 5 10 15 30
003 100 20 5 125
004 50 0 0 50
Is there any way in Data Studio to produce a percentage breakdown of the Total field based on the various Stream fields? I was thinking of a Tree map chart, so Stream1 should have a box showing 8.8%, Stream2 13.3% etc.
In Data Studio, create a new field called "Stream1%". In the definition of that field, enter the formula "Stream1/Total". This gives you a new field that you can add to your tables and charts and flag as Numeric > Percentage. You will now have a column with the result you want.
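As a rough check of what that formula produces, here is a small Python sketch using the four rows from the question. It is an illustration only, not Data Studio syntax, and the variable names are invented for the example. It shows two different numbers: the per-row share (Stream1/Total for each ID, which is what the field above computes row by row) and the overall share of each stream in the grand total, which is likely what a tree map of the whole dataset would need.

```python
# Rows copied from the question.
rows = [
    {"ID": "001", "Stream1": 10, "Stream2": 5, "Stream3": 5, "Total": 20},
    {"ID": "002", "Stream1": 5, "Stream2": 10, "Stream3": 15, "Total": 30},
    {"ID": "003", "Stream1": 100, "Stream2": 20, "Stream3": 5, "Total": 125},
    {"ID": "004", "Stream1": 50, "Stream2": 0, "Stream3": 0, "Total": 50},
]
streams = ["Stream1", "Stream2", "Stream3"]

# Per-row share: each stream divided by that row's Total.
per_row = [
    {stream: row[stream] / row["Total"] for stream in streams}
    for row in rows
]

# Overall share: sum of each stream divided by the sum of Total.
grand_total = sum(row["Total"] for row in rows)
overall = {
    stream: sum(row[stream] for row in rows) / grand_total
    for stream in streams
}

for stream in streams:
    print(f"{stream}: {overall[stream]:.1%}")
```

With this data the grand total is 225, so the overall shares are 165/225, 35/225 and 25/225 for Stream1, Stream2 and Stream3.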

BI - fact table design with incompatible grains

I'm quite new to designing databases for BI, and there is a point I do not understand well.
I'm trying to import French census data, where I have the population for each city. For each city, I have the population under different age classifications that can't really be related to each other.
For instance, let's say that one classification is 00 to 20 years old, 21 to 59, and 60+.
The other is much more precise: 00 to 02, 03 to 05, etc., but its bounds never match those of the first classification: I don't have 15 to 20, but 18 to 22, for example.
So those 2 classifications are incompatible. How can I use them in my fact table? Should I use 2 fact tables and 2 cubes? Should I use one fact table and 2 dimensions for 1 cube? But in that case, I will have double-counted facts when I sum to get the total population for a city, won't I?
This is national census data with national classifications, so changing them or estimating populations to mix those classifications is not an option. And to be clear, one row doesn't relate to one person, but to one city. My facts are not individuals but cities' populations.
So this table is like :
Line 1 : One city - one amount of population - one code for dim age (ex. 00 to 19 yo) of this population - code (m/f) for the dim gender of that population - date of the census
Line 2 : Same city - one amount of population - one code for dim age (ex. 20 to 34) of this population - code (m/f) for the dim gender - date of the census
And so it goes for a lot of cities, both gender, and multiple years.
I hope this question is clear enough, as English is not my native language and I'm quite new to DB and BI!
Thanks for helping me with that.
One possible solution using a single fact table and two dimensions for the age ranges:
1 - Categorical range based on the broadest census, for example:
Young 0-20
Adult 21-59
Senior 60+
You could then link the other census to this dimension with approximate values, for example 18-22 could be Young.
2 - Original age range. This dimension could be used for precise age ranges when you report on a single city. It can also help you evaluate the impact of the overlapping bounds (e.g. how many rows are in the young / 18-22 range?)
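The single-fact-table idea and the double-counting concern from the question can be sketched in Python as below. Everything here is hypothetical and for illustration only: the city name, the population figures, the `scheme` column and the function names are invented, and the fine age ranges are made up to resemble the question's description. The point is that each fact row records which classification it belongs to, so totals are taken within one classification, while the fine ranges are mapped approximately onto the broad categories (18-22 counted as Young, as suggested above).

```python
# Hypothetical fact rows: (city, scheme, age_range, population).
facts = [
    ("CityA", "broad", "00-20", 300),
    ("CityA", "broad", "21-59", 500),
    ("CityA", "broad", "60+", 200),
    ("CityA", "fine", "00-17", 250),
    ("CityA", "fine", "18-22", 80),
    ("CityA", "fine", "23-59", 470),
    ("CityA", "fine", "60+", 200),
]

# Approximate mapping of the fine ranges onto the broad categories.
# 18-22 straddles the 20/21 boundary and is assigned to Young here.
fine_to_category = {
    "00-17": "Young",
    "18-22": "Young",
    "23-59": "Adult",
    "60+": "Senior",
}


def total_population(rows, city, scheme):
    """Sum a city's population within ONE classification scheme only."""
    return sum(
        population
        for row_city, row_scheme, _, population in rows
        if row_city == city and row_scheme == scheme
    )


def population_by_category(rows, city):
    """Roll the fine classification up to the approximate broad categories."""
    totals = {}
    for row_city, row_scheme, age_range, population in rows:
        if row_city == city and row_scheme == "fine":
            category = fine_to_category[age_range]
            totals[category] = totals.get(category, 0) + population
    return totals


# Summing every row mixes both schemes and counts each person twice.
naive_total = sum(population for _, _, _, population in facts)

print(total_population(facts, "CityA", "broad"))
print(population_by_category(facts, "CityA"))
print(naive_total)
```

In this made-up data the naive sum is twice the per-scheme total, which is the double counting the question is worried about.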
You can create one dimension as below:
young 1-20
adult 21-59
senior 60+
Classification is
young city 1 : 1-20
young city 2 : 4-23
id field1 field2 field3 field4 .......
1 1 year young_city_1 other .......
2 2 year young_city_1 other .......
3 3 year young_city_1 other .......
4 4 year young_city_1 young_city_2 .......
Now you can report on any item and with any division.
I hope this helps.

How should i format/set up my dataset/dataframe? and factor ->numeric problems

New to R and new to this forum. I tried searching; I hope I don't embarrass myself by failing to identify previous answers.
So I've got my data, and I intend to fit some kind of GLMMs in the end, but that's far away in the future. First I'm going to fit some simple GLMs/LMs to learn what I'm doing.
First, about my data:
I have data sampled from 2 "general areas" on opposite sides of the country.
In these general areas there are roughly 50 trakts placed (in a grid, random starting point).
Trakts have been revisited each year for a duration of 4 years.
A trakt contains 16 sample plots. I intend to work on trakt level, so I use the means of the 16 sample plots for each trakt.
2x4x50 = 400 rows (the actual number is 373 rows after I removed trakts where not enough plots could be sampled due to terrain etc.)
the data in my excel file is currently divided like this:
rows = trakts
Columns= the measured variable
i got 8-10 columns i want to use
short example how the data looks now:
V1 - predictor, 4 different columns
V2 - Response variable = proportional data, 1-4 columns depending on which hypothesis i end up testing,
the glmm in the end would look something like, (V2~V1+V1+V1,(area,year))
Area Year Trakt V1 V2
A 2015 1 25.165651 0
A 2015 2 11.16894652 0.1
A 2015 3 18.231 0.16
A 2014 1 3.1222 N/A
A 2014 2 6.1651 0.98
A 2014 3 8.651 1
A 2013 1 6.16416 0.16
B 2015 1 9.12312 0.44
B 2015 2 22.2131 0.17
B 2015 3 12.213 0.76
B 2014 1 1.123132 0.66
B 2014 2 0.000 0.44
B 2014 3 5.213265 0.33
B 2013 1 2.1236 0.268
How should i get started on this?
8 different files?
Nested by trakts ( do i start nesting now or later when i'm doing glmms?)
I load my data into R through the read.table function.
If I run sapply(dataframe, class), V1 and V2 are factors, everything else integer.
If I run sapply(dataframe, mode), everything is numeric.
So finally to my actual problems: I have been trying to do normality tests (only tried Shapiro so far) but I keep getting errors that imply my data is not numeric.
Also, when I run a normality test, do I only run one column and evaluate it before moving on to the next column, or should I run several columns? The entire dataset?
Should I, in my case, run independent normality tests for each of my areas and years?
Hope it didn't end up too cluttered.
best regards

MS Analysis Cube - one-to-many joins

I am building an OLAP cube in MS SQL Server BI Studio. I have two main tables that contain my measures and dimensions.
One table contains
Date | Keywords | Measure1
where date-keyword is the composite key.
The other table looks like
Date | Keyword | Product | Measure2 | Measure3
where date-keyword-product is the composite key.
My problem is that there can be a one-to-many relationship between the date-keyword pairs in the first table and the date-keyword pairs in the second table (as the second table has data broken down by product).
I want to be able to make queries that look something like this when filtered for a given Keyword:
                              Measure1   Measure2   Measure3
============================================================
Tuesday, January 01 2013      23         19         18
============================================================
Bike                          23
Car                           23         16         13
Motorcycle                    23
Caravan                       23         2          4
Van                           23         1          1
I've created dimensions for the Date and ProductType but I'm having problems creating the dimension for the Keywords. I can create a Keyword dimension that affects the measures from the second table but not the first.
Can anyone point me to any good tutorials for doing this sort of thing?
Turns out the first table had one row with all null values (a weird side effect of uploading an excel file straight into MS SQL Server db). Because the value that the cube was trying to apply the dimension to was null in this one row, the whole cube build and deploy failed with no useful error messages! Grr
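For reference, the report layout the question asks for can be sketched in plain Python. This is not SSAS or MDX, only an illustration of the one-to-many lookup: the keyword value "kw", the ISO date string and the function name `report` are assumptions made for the example, while the numbers come from the sample output in the question. Measure1 lives at the date-keyword grain, so it is repeated on every product row and shown once in the header rather than summed, while Measure2 and Measure3 are summed over products.

```python
# Table 1: Measure1 keyed by (date, keyword). "kw" is a placeholder keyword.
measure1 = {("2013-01-01", "kw"): 23}

# Table 2: (Measure2, Measure3) keyed by (date, keyword, product).
detail = {
    ("2013-01-01", "kw", "Car"): (16, 13),
    ("2013-01-01", "kw", "Caravan"): (2, 4),
    ("2013-01-01", "kw", "Van"): (1, 1),
}

# Product dimension members, including products with no rows in table 2.
products = ["Bike", "Car", "Motorcycle", "Caravan", "Van"]


def report(date, keyword):
    """Build the header row and per-product rows for one date and keyword."""
    m1 = measure1.get((date, keyword))
    rows = []
    total2 = total3 = 0
    for product in products:
        m2, m3 = detail.get((date, keyword, product), (None, None))
        # Measure1 has no product grain, so the same value is repeated.
        rows.append((product, m1, m2, m3))
        total2 += m2 or 0
        total3 += m3 or 0
    return (date, m1, total2, total3), rows


header, product_rows = report("2013-01-01", "kw")
print(header)
for row in product_rows:
    print(row)
```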
