I'm refactoring very old code. Currently, PHP generates a separate select for every value. Say loc contains 1,2 and data contains a,b, it generates
select val from tablename where loc_id=1 and data_id=a;
select val from tablename where loc_id=1 and data_id=b;
select val from tablename where loc_id=2 and data_id=a;
select val from tablename where loc_id=2 and data_id=b;
...etc which all return either a single value or nothing. That meant I always had n(loc_id)*n(data_id) results, including nulls, which is necessary for subsequent processing. Knowing the order, this was used to generate an HTML table. Both data_id and loc_id can in theory scale up to a couple thousands (which is obviously not great in a table, but that's another concern).
+-----------+-----------+
| data_id 1 | data_id 2 |
+----------+-----------+-----------+
| loc_id 1 | - | 999.99 |
+----------+-----------+-----------+
+ loc_id 2 | 888.88 | - |
+----------+-----------+-----------+
To speed things up, I was looking at replacing this with a single query:
select val from tablename where loc_id in (1,2) and data_id in (a,b) order by loc_id asc, data_id asc;
to get a result like (below) and iterate to build my table.
Rownum VAL
------- --------
1 null
2 999.99
3 777.77
4 null
Unfortunately that approach drops the nulls from the resultset so I end up with
Rownum VAL
------- --------
1 999.99
2 777.77
Note that it is possible that neither data_id or loc_id have any match, in which case I would still need a null, null.
So I don't know which value matches which. I ways to match with the expected loc_id/data_id combination in php if I add loc_id and data_id... but that's getting messy.
Still a novice in SQL in general and that's absolutely the first time I work on PostgreSQL so hopefully that's not too obvious... As I post this I'm looking at two ways to solve this: any in array[] and joins. Will update if anything new is found.
tl;dr question
How do I do a where loc_id in (1,2) and data_id in (a,b) and keep the nulls so that I always get n(loc)*n(data) results?
You can achieve that in a single query with two steps:
Generate a matrix of all desired rows in the output.
LEFT [OUTER] JOIN to actual rows.
You get at least one row for every cell in your table.
If (loc_id, data_id) is unique, you get exactly one row.
SELECT t.val
FROM (VALUES (1), (2)) AS l(loc_id)
CROSS JOIN (VALUES ('a'), ('b')) AS d(data_id) -- generate total grid of rows
LEFT JOIN tablname t USING (loc_id, data_id) -- attach matching rows (if any)
ORDER BY l.loc_id, d.data_id;
Works for any number of columns with any number of values.
For your simple case:
SELECT t.val
FROM (
VALUES
(1, 'a'), (1, 'b')
, (2, 'a'), (2, 'b')
) AS ld (loc_id, data_id) -- total grid of rows
LEFT JOIN tablname t USING (loc_id, data_id) -- attach matching rows (if any)
ORDER BY ld.loc_id, ld.data_id;
where (loc_id in (1,2) or loc_id is null)
and (data_id in (a,b) or data_id is null)
Select the fields you use for filtering, so you know where the values came from:
select loc,data,val from tablename where loc in (1,2) and data in (a,b);
You won't get nulls this way either, but it's not a problem anymore. You know which fields are missing, and you know those are nulls.
Related
Good day.
The question almost the same as topic subject.
So, if I have a query:
select t.* from mytable m,
json_table
(m.json_col,'$.arr[*]'
columns(...)
) t
where m.id = 1
should I bother with order of rows?
TIA,
Andrew.
As with many things to do with databases, there is a difference between the theoretical and the practical and the practical at large scales:
In theory, a result set is an unordered set of rows unless you have specified an ORDER BY clause.
In practice, for small data sets, the query will be handled by a single process and will generate rows in the order the rows are read from the data file and then processed; which means that it will read a row from mytable and then process the JSON data in order and produce the rows in the same order as the array.
In practice at larger scales, the query may be handled by multiple processes on a parallel system (among other factors that may affect the order in which results are generated) where each process reads part of the data set and processes it and then the outputs are combined into a single result set. In this case, there is no guarantee which part of the parallel system will provide the next row and a consistent order cannot be guaranteed.
If you want to guarantee an order then use a FOR ORDINALITY column to capture the array order and then use an ORDER BY clause:
SELECT m.something,
t.*
FROM mytable m
CROSS APPLY JSON_TABLE(
m.json_col,
'$.arr[*]'
COLUMNS(
idx FOR ORDINALITY,
value NUMBER PATH '$'
)
) t
WHERE m.id = 1
ORDER BY m.something, t.idx
Which, for the sample data:
CREATE TABLE mytable (
id NUMBER,
something VARCHAR2(10),
json_col CLOB CHECK(json_col IS JSON)
);
INSERT INTO mytable(id, something, json_col)
SELECT 1, 'AAA', '{"arr":[3,2,1]}' FROM DUAL UNION ALL
SELECT 1, 'BBB', '{"arr":[17,2,42,9]}' FROM DUAL;
Outputs:
SOMETHING
IDX
VALUE
AAA
1
3
AAA
2
2
AAA
3
1
BBB
1
17
BBB
2
2
BBB
3
42
BBB
4
9
db<>fiddle here
db<>fiddle here
I have tried this on the following table,
SELECT DISTINCT
a.main_id,
array_agg(distinct a.secondary_id ) AS arr
FROM table1 a JOIN table1 b ON a.secondary_id = b.secondary_id or a.tertiary_id = b.tertiary_id
group by a.main_id, a.secondary_id , b.tertiary_id
I added the distinct to omit the duplicates But I can not get the whole row as an element in the array which does not even put the rows together to the array based on the below mentioned requirement. I was following this.
Table script:
Create table table1
(
id bigserial NOT NULL,
main_id integer NOT NULL,
secondary_id integer,
tertiary_id integer,
data1 text,
data2 text,
CONSTRAINT table1_pk PRIMARY KEY (main_id)
)
Data:
INSERT INTO table1(
main_id, secondary_id, tertiary_id, data1, data2)
VALUES (1,2,NULL,'data1_1_2_N','data2_1_2_N'),
(2,2,NULL,'data1_2_2_N','data2_2_2_N'),
(3,3,5,'data1_3_3_5','data2_3_3_5'),
(4,3,5,'data1_4_3_5','data2_4_3_5'),
(5,NULL,1,'data1_5_N_1','data2_5_N_1'),
(6,NULL,1,'data1_6_N_1','data2_6_N_1'),
(7,NULL,1,'data1_7_N_1','data2_7_N_1'),
(8,NULL,2,'data1_8_N_2','data2_8_N_2'),
(9,NULL,2,'data1_9_N_2','data2_9_N_2'),
(10,NULL,3,'data1_10_N_3','data2_10_N_3'),
(11,12,12,'data1_11_12_12','data2_11_12_12'),
(12,12,11,'data1_12_12_11','data2_12_12_11')
Requirement:
If secondary_id is equal in two or more rows they should be considered as one set,
else if tertiary_id is equal they can be considered as one set.
Expected Result:
1 | {(1,2,NULL,'data1_1_2_N','data2_1_2_N'),(2,2,NULL,'data1_2_2_N','data2_2_2_N')}
2 | {(3,3,NULL,'data1_3_3_N','data2_3_3_N'),(4,3,NULL,'data1_4_3_N','data2_4_3_N')}
3 | {(5,NULL,1,'data1_5_N_1','data2_5_N_1'),(6,NULL,1,'data1_6_N_1','data2_6_N_1'),(7,NULL,1,'data1_7_N_1','data2_7_N_1')}
4 | {(8,NULL,2,'data1_8_N_2','data2_8_N_2'),(9,NULL,2,'data1_9_N_2','data2_9_N_2')}
5 | {(10,NULL,3,'data1_10_N_3','data2_10_N_3')}
6 | {(11,12,12,'data1_11_12_12','data2_11_12_12'),(12,12,11,'data1_12_12_11','data2_12_12_11') }
Version "PostgreSQL 9.3.11"
This should achieve your output. The trick sticks within conditional group by clause to handle cases where secondary_id and tertiary_id are the same for a record which has a matching record on both of those fields.
select array_agg(distinct t1)
from table1 t1
join table1 t2 on
t1.secondary_id = t2.secondary_id
or t1.tertiary_id = t2.tertiary_id
group by
case
when t1.secondary_id is null or t1.secondary_id is null
then concat(t1.secondary_id,'#',t1.tertiary_id) -- #1
when t1.secondary_id is not null and t1.tertiary_id is not null and t1.secondary_id = t2.secondary_id
then t1.secondary_id::TEXT -- #2
when t1.secondary_id is not null and t1.tertiary_id is not null and t1.tertiary_id = t2.tertiary_id
then t1.tertiary_id::TEXT -- #3
end
order by 1
Standard case is when any of the fields are null, which stands for #1. We need to group by both columns and we're tricking it by concatenating both values from columns with a # mark and doing a group by this concatenated column.
For #2 and #3 we need to cast the grouping value to type text to make it go through (types returned by CASE statement need to be the same).
Option #2 serves the case when both values are not null and secondary_id matches between those "chosen" rows from selfjoin. Option #3 is analogical, but for tertiary_id match.
Output:
array_agg
------------------------------------------------------------------------------------------------------------
{"(1,1,2,,data1_1_2_N,data2_1_2_N)","(2,2,2,,data1_2_2_N,data2_2_2_N)"}
{"(3,3,3,5,data1_3_3_5,data2_3_3_5)","(4,4,3,5,data1_4_3_5,data2_4_3_5)"}
{"(5,5,,1,data1_5_N_1,data2_5_N_1)","(6,6,,1,data1_6_N_1,data2_6_N_1)","(7,7,,1,data1_7_N_1,data2_7_N_1)"}
{"(8,8,,2,data1_8_N_2,data2_8_N_2)","(9,9,,2,data1_9_N_2,data2_9_N_2)"}
{"(10,10,,3,data1_10_N_3,data2_10_N_3)"}
{"(11,11,4,4,data1_11_4_4,data2_11_4_4)","(12,12,4,11,data1_12_4_11,data2_12_4_11)"}
If you'd like to get rid of column id from your record, you could use a CTE and select all columns but id and then refer to that CTE in from clause.
This is my first post - so I apologise if it's in the wrong seciton!
I'm joining two tables with a one-to-many relationship using their respective ID numbers: but I only want to return the most recent record for the joined table and I'm not entirely sure where to even start!
My original code for returning everything is shown below:
SELECT table_DATES.[date-ID], *
FROM table_CORE LEFT JOIN table_DATES ON [table_CORE].[core-ID] = table_DATES.[date-ID]
WHERE table_CORE.[core-ID] Like '*'
ORDER BY [table_CORE].[core-ID], [table_DATES].[iteration];
This returns a group of records: showing every matching ID between table_CORE and table_DATES:
table_CORE date-ID iteration
1 1 1
1 1 2
1 1 3
2 2 1
2 2 2
3 3 1
4 4 1
But I need to return only the date with the maximum value in the "iteration" field as shown below
table_CORE date-ID iteration Additional data
1 1 3 MoreInfo
2 2 2 MoreInfo
3 3 1 MoreInfo
4 4 1 MoreInfo
I really don't even know where to start - obviously it's going to be a JOIN query of some sort - but I'm not sure how to get the subquery to return only the highest iteration for each item in table 2's ID field?
Hope that makes sense - I'll reword if it comes to it!
--edit--
I'm wondering how to integrate that when I'm needing all the fields from table 1 (table_CORE in this case) and all the fields from table2 (table_DATES) joined as well?
Both tables have additional fields that will need to be merged.
I'm pretty sure I can just add the fields into the "SELECT" and "GROUP BY" clauses, but there are around 40 fields altogether (and typing all of them will be tedious!)
Try using the MAX aggregate function like this with a GROUP BY clause.
SELECT
[ID1],
[ID2],
MAX([iteration])
FROM
table_CORE
LEFT JOIN table_DATES
ON [table_CORE].[core-ID] = table_DATES.[date-ID]
WHERE
table_CORE.[core-ID] Like '*' --LIKE '%something%' ??
GROUP BY
[ID1],
[ID2]
Your example field names don't match your sample query so I'm guessing a little bit.
Just to make sure that I have everything you’re asking for right, I am going to restate some of your question and then answer it.
Your source tables look like this:
table_core:
table_dates:
And your outputs are like this:
Current:
Desired:
In order to make that happen all you need to do is use a subquery (or a CTE) as a “cross-reference” table. (I used temp tables to recreate your data example and _ in place of the - in your column names).
--Loading the example data
create table #table_core
(
core_id int not null
)
create table #table_dates
(
date_id int not null
, iteration int not null
, additional_data varchar(25) null
)
insert into #table_core values (1), (2), (3), (4)
insert into #table_dates values (1,1, 'More Info 1'),(1,2, 'More Info 2'),(1,3, 'More Info 3'),(2,1, 'More Info 4'),(2,2, 'More Info 5'),(3,1, 'More Info 6'),(4,1, 'More Info 7')
--select query needed for desired output (using a CTE)
; with iter_max as
(
select td.date_id
, max(td.iteration) as iteration_max
from #table_dates as td
group by td.date_id
)
select tc.*
, td.*
from #table_core as tc
left join iter_max as im on tc.core_id = im.date_id
inner join #table_dates as td on im.date_id = td.date_id
and im.iteration_max = td.iteration
select *
from
(
SELECT table_DATES.[date-ID], *
, row_number() over (partition by table_CORE date-ID order by iteration desc) as rn
FROM table_CORE
LEFT JOIN table_DATES
ON [table_CORE].[core-ID] = table_DATES.[date-ID]
WHERE table_CORE.[core-ID] Like '*'
) tt
where tt.rn = 1
ORDER BY [core-ID]
I have two tables, X and Y, with identical schema but different records. Given a record from X, I need a query to find the closest matching record in Y that contains NULL values for non-matching columns. Identity columns should be excluded from the comparison. For example, if my record looked like this:
------------------------
id | col1 | col2 | col3
------------------------
0 |'abc' |'def' | 'ghi'
And table Y looked like this:
------------------------
id | col1 | col2 | col3
------------------------
6 |'abc' |'def' | 'zzz'
8 | NULL |'def' | NULL
Then the closest match would be record 8, since where the columns don't match, there are NULL values. 6 WOULD have been the closest match, but the 'zzz' disqualified it.
What's unique about this problem is that the schema of the tables is unknown besides the id column and the data types. There could be 4 columns, or there could be 7 columns. We just don't know - it's dynamic. All we know is that there is going to be an 'id' column and that the columns will be strings, either varchar or nvarchar.
What is the best query in this case to pick the closest matching record out of Y, given a record from X? I'm actually writing a function. The input is an integer (the id of a record in X) and the output is an integer (the id of a record in Y, or NULL). I'm an SQL novice, so a brief explanation of what's happening in your solution would help me greatly.
There could be 4 columns, or there could be 7 columns.... I'm actually writing a function.
This is an impossible task. Because functions are deterministic, so you cannot have a function that will work on an arbitrary table structure, using dynamic SQL. A stored procedure, sure, but not a function.
However, the below shows you a way using FOR XML and some decomposing of the XML to unpivot rows into column names and values which can then be compared. The technique used here and the queries can be incorporated into a stored procedure.
MS SQL Server 2008 Schema Setup:
-- this is the data table to match against
create table t1 (
id int,
col1 varchar(10),
col2 varchar(20),
col3 nvarchar(40));
insert t1
select 6, 'abc', 'def', 'zzz' union all
select 8, null , 'def', null;
-- this is the data with the row you want to match
create table t2 (
id int,
col1 varchar(10),
col2 varchar(20),
col3 nvarchar(40));
insert t2
select 0, 'abc', 'def', 'ghi';
GO
Query 1:
;with unpivoted1 as (
select n.n.value('local-name(.)','nvarchar(max)') colname,
n.n.value('.','nvarchar(max)') value
from (select (select * from t2 where id=0 for xml path(''), type)) x(xml)
cross apply x.xml.nodes('//*[local-name()!="id"]') n(n)
), unpivoted2 as (
select x.id,
n.n.value('local-name(.)','nvarchar(max)') colname,
n.n.value('.','nvarchar(max)') value
from (select id,(select * from t1 where id=outr.id for xml path(''), type) from t1 outr) x(id,xml)
cross apply x.xml.nodes('//*[local-name()!="id"]') n(n)
)
select TOP(1) WITH TIES
B.id,
sum(case when A.value=B.value then 1 else 0 end) matches
from unpivoted1 A
join unpivoted2 B on A.colname = B.colname
group by B.id
having max(case when A.value <> B.value then 1 end) is null
ORDER BY matches;
Results:
| ID | MATCHES |
----------------
| 8 | 1 |
I plan to design a database model for a Business Intelligence system that stores business figures for a set of locations and a set of years.
Some of these figures should be calculated from other figures for the same year and the same location. In the following text I'll call figures that are not being calculated "basic figures". To store the basic figures, a table design with these columns would make sense:
| year | location_id | goods_costs | marketing_costs | warehouse_costs | administrative_costs |
Using this table I could create a view that calculates all other necessary figures:
CREATE VIEW all_figures
SELECT *,
goods_costs + marketing_costs + warehouse_costs + administrative_costs
AS total_costs
FROM basic_figures
This would be great if I didn't run into the following problems:
Most databases (including MySQL which I'm planning to use [edit: but which I'm not bound to]) have some kind of colum count or row size limit. Since I have to store a lot of figures (and have to calculate even more), I'd exceed this limit.
It is not uncommon that new figures have to be added. (Adding a figure would require changes to the table design. And as such changes ususally perform poorly they would block any access to the table for quite a long time.)
I also have to store additional information for each figure, e.g. a description and a unit (all figures are decimal numbers, but some might be in US$/EUR whereas others might be in %). I'd have to make sure that the basic_figures table, the all_figures view and the table containing the figure information are all correctly updated if anything changes. (This is more a data normalization problem than a technical/implementation problem.)
~~
Therefore I considered to use this table design:
+---------+-------------+-------------+-------+
| year | location_id | figure_id | value |
+---------+-------------+-------------+-------+
| 2009 | 1 | goods_costs | 300 |
...
This entity-attribute-value-like design could be a first solution for these three issues. However, it would also have a new downside: Calculations get messy. Really messy.
To build a view similar to the one above, I'd have to use a query like this:
(SELECT * FROM basic_figures_eav)
UNION ALL
(SELECT a.year_id, a.location_id, "total_costs", a.value + b.value + c.value + d.value
FROM basic_figures_eav a
INNER JOIN basic_figures_eav b ON a.year_id = b.year_id AND a.location_id = b.location_id AND b.figure_id = "marketing_costs"
INNER JOIN basic_figures_eav c ON a.year_id = c.year_id AND a.location_id = c.location_id AND c.figure_id = "warehouse_costs"
INNER JOIN basic_figures_eav d ON a.year_id = d.year_id AND a.location_id = d.location_id AND d.figure_id = "administrative_costs"
WHERE a.figure_id = "goods_costs");
Isn't that a beauty? And notice that this is just the query for ONE figure. All other calculated figures (of whom there are many as I wrote above) would also have to UNIONed with this query.
~~
After this long explanation of my problems, I now conculde with my actual questions:
Which database design would you suggest? / Would you use one of the two designs above? (If yes, which and why? If no, why?)
Do you have a suggestion for a completely other approach? (Which I would very, very much appreciate!)
Should the database actually be the one that does the calculations after all? Does it make more sense to move the calculation to the application logic and simply store the results?
By the way: I already asked a similar question on the MySQL forums. However, since answers were a bit sparse and this is not just a MySQL issue after all, I completely rewrote my quesion and posted it here. (So this is not a cross-post.) Here's the link to the thread there: http://forums.mysql.com/read.php?125,560752,560752#msg-560752
The question is (at least somewhat) DBMS specific.
If you can consider other DBMS, you might want to look at PostgreSQL and it's hstore datatype which is essentially a key/value pair.
The downsize of that is, that you lose datatype checking with as everything is stored as a string in the map.
The design that you are aiming at is called "Entity Attribute Value". You might want to find other alternatives as well.
Edit, here is an example on how this could be used:
Table setup
CREATE TABLE basic_figures
(
year_id integer,
location_id integer,
figures hstore
);
insert into basic_figures (year_id, location_id, figures)
values
(1, 1, hstore ('marketing_costs => 200, goods_costs => 100, warehouse_costs => 400')),
(1, 2, hstore ('marketing_costs => 50, goods_costs => 75, warehouse_costs => 250')),
(1, 3, hstore ('adminstrative_costs => 100'));
Basic select
select year_id,
location_id,
to_number(figures -> 'marketing_costs', 'FM999999') as marketing_costs,
to_number(figures -> 'goods_costs', 'FM999999') as goods_costs,
to_number(figures -> 'warehouse_costs', 'FM999999') as warehouse_costs,
to_number(figures -> 'adminstrative_costs', 'FM999999') as adminstrative_costs
from basic_figures bf;
It's probably easier to create a view for that that hides the conversion of the hstore values. The downside of that is, that the view needs to be re-created each time a new cost type is added.
Getting the totals
To get the sum of all costs for each year_id/location_id you can use the following statement:
SELECT year_id,
location_id,
sum(to_number(value, '99999')) as total
FROM (
SELECT year_id,
location_id,
(each(figures)).key,
(each(figures)).value
FROM basic_figures
) AS data
GROUP BY year_id, location_id;
year_id | location_id | total
---------+-------------+-------
1 | 3 | 100
1 | 2 | 375
1 | 1 | 700
That could be joined to the query above, but it's probably faster and easier to use if you create a function that calculates the total for all keys in a single hstore column:
Function to sum the totals
create or replace function sum_hstore(figures hstore)
returns bigint
as
$body$
declare
result bigint;
figure_values text[];
begin
result := 0;
figure_values := avals(figures);
for i in 1..array_length(figure_values, 1) loop
result := result + to_number(figure_values[i], '999999');
end loop;
return result;
end;
$body$
language plpgsql;
That function can easily be used in the first select:
select bf.year_id,
bf.location_id,
to_number(bf.figures -> 'marketing_costs', '99999999') as marketing_costs,
to_number(bf.figures -> 'goods_costs', '99999999') as goods_costs,
to_number(bf.figures -> 'warehouse_costs', '99999999') as warehouse_costs,
to_number(bf.figures -> 'adminstrative_costs', '99999999') as adminstrative_costs,
sum_hstore(bf.figures) as total
from basic_figures bf;
Automatic view creation
The following PL/pgSQL block can be used to (re-)create a view that contains one column for each key in the figures column plus the totals based on the sum_hstore function above:
do
$body$
declare
create_sql text;
types record;
begin
create_sql := 'create or replace view extended_figures as select year_id, location_id ';
for types in SELECT distinct (each(figures)).key as type_name FROM basic_figures loop
create_sql := create_sql || ', to_number(figures -> '''||types.type_name||''', ''9999999'') as '||types.type_name;
end loop;
create_sql := create_sql ||', sum_hstore(figures) as total from basic_figures';
execute create_sql;
end;
$body$
language plpgsql;
After running that function you can simply do a:
select *
from extended_figures
and you'll get as many columns as there are different cost types.
Note that there is no error checking at all if the values in the hstore are actually numbers. That could potentially be done with a trigger.
This is a way to "denormalise" (pivot) an EAV table without needing pivot. Note the left JOIN and the coalesce, which causes non-existant rows to a appear as "zero cost".
NOTE: I had to replace the quoting of the string literals to single quotes.
CREATE TABLE basic_figures_eav
( year_id INTEGER
, location_id INTEGER
, figure_id varchar
, value INTEGER
);
INSERT INTO basic_figures_eav ( year_id , location_id , figure_id , value ) VALUES
(1,1,'goods_costs', 100)
, (1,1,'marketing_costs', 200)
, (1,1,'warehouse_costs', 400)
, (1,1,'administrative_costs', 800)
, (1,2,'goods_costs', 100)
, (1,2,'marketing_costs', 200)
, (1,2,'warehouse_costs', 400)
, (1,3,'administrative_costs', 800)
;
SELECT x.year_id, x.location_id
, COALESCE (a.value,0) AS goods_costs
, COALESCE (b.value,0) AS marketing_costs
, COALESCE (c.value,0) AS warehouse_costs
, COALESCE (d.value,0) AS administrative_costs
--
, COALESCE (a.value,0)
+ COALESCE (b.value,0)
+ COALESCE (c.value,0)
+ COALESCE (d.value,0)
AS total_costs
-- need this to get all the {year_id,location_id} combinations
-- that have at least one tuple in the EAV table
FROM (
SELECT DISTINCT year_id, location_id
FROM basic_figures_eav
-- WHERE <selection of wanted observations>
) AS x
LEFT JOIN basic_figures_eav a ON a.year_id = x.year_id AND a.location_id = x.location_id AND a.figure_id = 'goods_costs'
LEFT JOIN basic_figures_eav b ON b.year_id = x.year_id AND b.location_id = x.location_id AND b.figure_id = 'marketing_costs'
LEFT JOIN basic_figures_eav c ON c.year_id = x.year_id AND c.location_id = x.location_id AND c.figure_id = 'warehouse_costs'
LEFT JOIN basic_figures_eav d ON d.year_id = x.year_id AND d.location_id = x.location_id AND d.figure_id = 'administrative_costs'
;
Result:
CREATE TABLE
INSERT 0 8
year_id | location_id | goods_costs | marketing_costs | warehouse_costs | administrative_costs | total_costs
---------+-------------+-------------+-----------------+-----------------+----------------------+-------------
1 | 3 | 0 | 0 | 0 | 800 | 800
1 | 2 | 100 | 200 | 400 | 0 | 700
1 | 1 | 100 | 200 | 400 | 800 | 1500
(3 rows)
I just want to point out that the second half of your query is needlessly complicated. You can do:
(SELECT a.year_id, a.location_id, "total_costs",
sum(a.value)
FROM basic_figures_eav a
where a.figure_id in ('marketing_costs', 'warehouse_costs', 'administrative_costs',
'goods_costs')
)
Although this uses an aggregation, with a composite index on year_id, location_id, and figure_id, the performance should be similar.
As for the rest of your question, there is a problem with databases limiting the number of columns. I would suggest that you put your base data in a table, with an auto-incremented primary key. Then, create summary tables, linked by the same primary key.
In many environments, you can recreate the summary tables once per day or once per night. If you need real time information, you can use stored procedures/triggers to update the data. That is, when data is updated or inserted, then it can be modified in the summary tables.
Also, I tried to find out if calculated/computed columns in SQL Server count against the maximum number of columns in the table (1,024). I wasn't able to find anything definitive. This is easy enough to test, but I'm not near a database right now.