I want to select only the records from table Stock based on the column PostingDate.
The PostingDate should be after the InitDate in another table called InitClient. However, there are currently two clients in both tables (client 1 and client 2), and each has a different InitDate.
With the code below I get exactly what I need, based on the sample data included underneath. However, two problems arise: first, on millions of records the query takes way too long (hours); second, it isn't dynamic at all, since it has to be extended every time a new client is added.
A potential option to cover the performance issue would be to write two separate queries, one for Client 1 and one for Client 2, with a UNION in between. Unfortunately, that still isn't dynamic enough, since more clients are possible.
SELECT
Material
,Stock
,Stock.PostingDate
,Stock.Client
FROM Stock
LEFT JOIN (SELECT InitDate FROM InitClient where Client = 1) C1 ON 1=1
LEFT JOIN (SELECT InitDate FROM InitClient where Client = 2) C2 ON 1=1
WHERE
(
(Stock.Client = 1 AND Stock.PostingDate > C1.InitDate) OR
(Stock.Client = 2 AND Stock.PostingDate > C2.InitDate)
)
Sample dataset:
CREATE TABLE InitClient
(
Client varchar(300),
InitDate date
);
INSERT INTO InitClient (Client,InitDate)
VALUES
('1', '5/1/2021'),
('2', '1/31/2021');
SELECT * FROM InitClient
CREATE TABLE Stock
(
Material varchar(300),
PostingDate varchar(300),
Stock varchar(300),
Client varchar(300)
);
INSERT INTO Stock (Material,PostingDate,Stock,Client)
VALUES
('322', '1/1/2021', '5', '1'),
('101', '2/1/2021', '5', '2'),
('322', '3/2/2021', '10', '1'),
('101', '4/13/2021', '5', '1'),
('400', '5/11/2021', '170', '2'),
('401', '6/20/2021', '200', '1'),
('322', '7/20/2021', '160', '2'),
('400', '8/9/2021', '93', '2');
SELECT * FROM Stock
Desired result, but with a substitute for the OR condition to improve performance:
| Material | PostingDate | Stock | Client |
|----------|-------------|-------|--------|
| 322 | 1/1/2021 | 5 | 1 |
| 101 | 2/1/2021 | 5 | 2 |
| 322 | 3/2/2021 | 10 | 1 |
| 101 | 4/13/2021 | 5 | 1 |
| 400 | 5/11/2021 | 170 | 2 |
| 401 | 6/20/2021 | 200 | 1 |
| 322 | 7/20/2021 | 160 | 2 |
| 400 | 8/9/2021 | 93 | 2 |
Any suggestions for a substitute in the above code that keeps performance while making it dynamic?
You can optimize this query quite a bit.
Firstly, those two LEFT JOINs are basically just semi-joins, because you don't actually select any columns from them. So we can turn them into a single EXISTS.
You will also get an implicit conversion to int, because Client is varchar while the literals 1 and 2 are ints. So change them to '1' and '2', or change the column type.
PostingDate is also varchar; it should really be date.
SELECT
s.Material
,s.Stock
,s.PostingDate
,s.Client
FROM Stock s
WHERE s.Client IN ('1','2')
AND EXISTS (SELECT 1
FROM InitClient c
WHERE s.PostingDate > c.InitDate
AND c.Client = s.Client
);
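The type changes mentioned above could be done roughly like this; a sketch only, since whether the stored varchar dates convert cleanly depends on their format and your session settings:
ALTER TABLE Stock ALTER COLUMN PostingDate date;
-- only if you prefer changing the column type over quoting the literals:
ALTER TABLE Stock ALTER COLUMN Client int;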
Next you want to look at indexing. For this query (not accounting for any other queries being run), you probably want the following indexes (remove the INCLUDE for a clustered index)
InitClient (Client, InitDate)
Stock (Client) INCLUDE (PostingDate, Material, Stock)
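As a sketch, those indexes could be created like this (index names are just illustrative):
CREATE NONCLUSTERED INDEX IX_InitClient_Client_InitDate
    ON InitClient (Client, InitDate);

CREATE NONCLUSTERED INDEX IX_Stock_Client
    ON Stock (Client) INCLUDE (PostingDate, Material, Stock);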
It is possible that even with these indexes you may get a scan on Stock, because IN functions like an OR. This does not always happen, but it's worth checking. If it does, you can instead rewrite the query to use UNION ALL:
SELECT
s.Material
,s.Stock
,s.PostingDate
,s.Client
FROM (
SELECT *
FROM Stock s
WHERE s.Client = '1'
UNION ALL
SELECT *
FROM Stock s
WHERE s.Client = '2'
) s
WHERE EXISTS (SELECT 1
FROM InitClient c
WHERE s.PostingDate > c.InitDate
AND c.Client = s.Client
);
db<>fiddle
There is nothing wrong with expecting your query to be dynamic. However, in order to make it more performant, you may need to reach a compromise between two conflicting expectations. I will present a few ways to optimize your query; some of them involve drastic changes, but ultimately it is you or your client who decides how this should be improved. Also, some of the improvements might be ineffective, so do not take anything for granted and test everything. Without further ado, let's see the suggestions.
The query
First I would try to change the query a little, maybe something like this could help you
SELECT
Material
,Stock
,Stock.PostingDate
,C1.InitDate
,C2.InitDate
,Stock.Client
FROM Stock
LEFT JOIN InitClient C1 ON C1.Client = 1
LEFT JOIN InitClient C2 ON C2.Client = 2
WHERE
(
(Stock.Client = 1 AND Stock.PostingDate > C1.InitDate) OR
(Stock.Client = 2 AND Stock.PostingDate > C2.InitDate)
)
Sometimes the simple step of getting rid of subselects does the trick.
The indexes
You may want to speed up your process by creating indexes, for example on Stock.PostingDate.
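For example (index names are illustrative; which one helps depends on your actual query plans):
CREATE INDEX idx_stock_postingdate ON Stock (PostingDate);
-- or, to cover the per-client filter as well:
CREATE INDEX idx_stock_client_postingdate ON Stock (Client, PostingDate);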
Helper table
You can create a helper table where you store the relevant Stock records, so you perform the slow query only once in a while, maybe once a week or whenever a new client enters the stage, and store the results in the helper table. Once that prerequisite calculation is done, you will be able to query only the helper table, reaching lightning-fast behavior. The idea is to execute the slow query rarely, cache/store the results, and reuse them instead of recalculating them every time.
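A rough sketch of that idea, with a hypothetical helper table named StockAfterInit (column types copied from the sample schema; as noted elsewhere, a proper date type for PostingDate would make the comparison more reliable):
CREATE TABLE StockAfterInit
(
    Material    varchar(300),
    PostingDate varchar(300),
    Stock       varchar(300),
    Client      varchar(300)
);

-- Re-run this periodically, or whenever a new client is added:
TRUNCATE TABLE StockAfterInit;

INSERT INTO StockAfterInit (Material, PostingDate, Stock, Client)
SELECT s.Material, s.PostingDate, s.Stock, s.Client
FROM Stock s
JOIN InitClient c ON c.Client = s.Client
WHERE s.PostingDate > c.InitDate;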
A new column
You could create a column in your Stock table named InitDate and fill that with data for each record periodically. It will take a long while at the first execution, but then you will be able to query only the Stock table without joins and subselects.
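A possible sketch of that approach (SQL Server syntax; statements are illustrative, not the only way to backfill the column):
ALTER TABLE Stock ADD InitDate date;

UPDATE s
SET    s.InitDate = c.InitDate
FROM   Stock s
JOIN   InitClient c ON c.Client = s.Client;

-- Afterwards the filter no longer needs a join or a subselect:
SELECT Material, Stock, PostingDate, Client
FROM   Stock
WHERE  PostingDate > InitDate;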
I am using SQL Server 2016. The column in question contains JSON. It always stores data in the format below:
{"question1":"123","question2":"123","reference-id":"Z6SIPLGKE56"}
So, multiple rows will have the same structure with different values.
Is there a way I can retrieve it back as a table, or put it into a temporary table? The final output would look like this:
question1 | question2 | reference-id|....
123 | 123 | Z6SIPLGKE56
456 | 456 | Z6SWFLGKE56
The end result I am looking for is to export the results to a CSV. I can do this outside of SQL Server, but I was wondering whether it's possible with built-in features of SQL Server. (From the searches I did, it seems the available functions such as OPENJSON don't allow you to do this in one pass.)
UPDATE 1 - Since more details were requested by commenters
This is a survey application, so users can design their own surveys. The structure is stored as JSON. As a start, let's assume each survey has its own fixed set of questions (e.g. Survey 1 has 5 questions whereas Survey 2 has 10 questions).
Now, let's say two users fill in survey 1. Sample data, visualized as JSON, is as follows:
from user 1:
{"forms-survey-client-reference-id":"RYRT4ZU1ZO","question1":"ans1","question2":"ans2"....}
from user 2
{"forms-survey-client-reference-id":"RYRT4ZU1FE","question1":"asdf","question2":"dfhdsf"....}
So the CSV output for this survey has to be: (ignore the column order)
question1 | question2 | reference-id|....
asdf | dfhdsf | RYRT4ZU1FE
ans1 | ans2 | RYRT4ZU1ZO
Now consider that survey 2 has the following data after submissions from:
User 1
{"forms-survey-client-reference-id":"RYRT4ZU1ZO","question1":"ans1","question2":"opt1,opt2,opt3"....}
User 2
{"forms-survey-client-reference-id":"RYRT4ABCZO","question1":"ans1","question2":"opt1,opt2"....}
Notice that for question 2, users have selected multiple answers (checkboxes), which are stored as a single comma-separated string (User 1 has selected 3 items and User 2 has selected 2 items).
The CSV output for above should be:
question1 | question2 | reference-id|....
ans1 | opt1,opt2,opt3 | RYRT4ZU1ZO
ans1 | opt1,opt2 | RYRT4ABCZO
Assuming that this is your JSON structure, you can use the following:
DECLARE @json NVARCHAR(4000) = '{"question1":"123","question2":"123","reference-id":"Z6SIPLGKE56"}'
SELECT *
FROM
(
SELECT [key] JsonKey , value JsonValue
FROM OPENJSON (@json)
) X
PIVOT
(
MAX(JsonValue) FOR JsonKey IN ([question1], [question2], [reference-id])
) P
If the structure is not going to be consistent, you'll need to create a dynamic pivot.
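A minimal sketch of such a dynamic pivot, assuming the JSON is stored in a column named SurveyJson of a hypothetical table dbo.SurveyAnswers that has an Id key column (names are illustrative, not from the original post):
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Collect every distinct JSON key as a pivot column
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(j.[key])
    FROM dbo.SurveyAnswers s
    CROSS APPLY OPENJSON(s.SurveyJson) j
    GROUP BY j.[key]
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '');

SET @sql = N'
SELECT Id, ' + @cols + N'
FROM (
    SELECT s.Id, j.[key] AS JsonKey, j.value AS JsonValue
    FROM dbo.SurveyAnswers s
    CROSS APPLY OPENJSON(s.SurveyJson) j
) src
PIVOT (MAX(JsonValue) FOR JsonKey IN (' + @cols + N')) p;';

EXEC sys.sp_executesql @sql;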
You can also do this:
DECLARE @json NVARCHAR(4000) = '{"question1":"123","question2":"123","reference-id":"Z6SIPLGKE56"}'
SELECT *
FROM OPENJSON (@json)
WITH ([question1] INT '$."question1"',
[question2] INT '$."question2"',
[reference-id] varchar(100) '$."reference-id"')
One method is with OPENJSON and CROSS APPLY:
DECLARE @JsonTable TABLE(json nvarchar(MAX));
INSERT INTO @JsonTable VALUES
(N'{"question1":"123","question2":"123","reference-id":"Z6SIPLGKE56"}')
, (N'{"question1":"456","question2":"456","reference-id":"Z6SIPLGKE57"}');
SELECT
question1
, question2
, reference_id
FROM @JsonTable
CROSS APPLY OPENJSON(json)
WITH (
question1 int '$.question1'
, question2 int '$.question2'
, reference_id varchar(20) '$."reference-id"'
);
I am playing around with a SQLite database in a vb.net application. The database is supposed to store time series data for many variables.
Right now I am trying to build the database with 2 tables as follows:
Table varNames:
CREATE TABLE IF NOT EXISTS varNames(id INTEGER PRIMARY KEY, varName TEXT UNIQUE);
It looks like this:
ID | varName
---------------
1 | var1
2 | var2
... | ...
Table varValues:
CREATE TABLE IF NOT EXISTS varValues(timestamp INTEGER, varValue FLOAT, id INTEGER, FOREIGN KEY(id) REFERENCES varNames(id) ON DELETE CASCADE);
It looks like this:
timestamp | varValue | id
------------------------------
1 | 1.0345 | 1
4 | 3.5643 | 1
1 | 7.7866 | 2
3 | 4.5668 | 2
... | .... | ...
The first table contains all variable names with IDs. The second table contains the values of each variable for many time steps (indicated by the timestamps). A foreign key links the tables through the variable IDs.
Building up the database works fine.
Now I want to query the database and plot the time series for selected variables. For this I use the following statement:
select [timestamp], [varValue] FROM varValues WHERE id = (SELECT id from varNames WHERE varName= '" & NAMEvariable & "');
Since the user does not know the variable ID, only the name of the variable (in NAMEvariable), I use the ...WHERE id = (SELECT... construct. It seems like this really slows down the performance. The time series have up to 50k points.
Is there any better way to query values for a specific variable which can only be addressed by its name?
You probably should use a join query, something like:
SELECT a.[timestamp], a.varValue
FROM varValues AS a, varNames AS b
WHERE b.varName = <name>
AND a.id = b.ID
edit: To query for more than one parameter, use something like this:
SELECT a.[timestamp], a.varValue
FROM varValues AS a, varNames AS b
WHERE b.varName IN (<name1>, <name2>, ...)
AND a.id = b.ID
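Independent of the query shape, indexing usually matters more here. A minimal sketch (index name illustrative; varName already has an implicit index because it is declared UNIQUE):
-- Lets SQLite jump straight to the rows for one variable, ordered by time
CREATE INDEX IF NOT EXISTS idx_varvalues_id_ts ON varValues(id, timestamp);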
The ON CONFLICT DO UPDATE / DO NOTHING feature was introduced in PostgreSQL 9.5.
Support for CREATE SERVER and FOREIGN TABLE was introduced in PostgreSQL 9.2.
When I use ON CONFLICT DO UPDATE on a foreign table it does not work, but when I run the same query on a normal table it works. The queries are given below.
// For normal table
INSERT INTO app
(app_id,app_name,app_date)
SELECT
p.app_id,
p.app_name,
p.app_date FROM app p
WHERE p.app_id=2422
ON CONFLICT (app_id) DO
UPDATE SET app_date = excluded.app_date ;
O/P : Query returned successfully: one row affected, 5 msec execution time.
// For foreign table concept
// foreign_app is foreign table and app is normal table
INSERT INTO foreign_app
(app_id,app_name,app_date)
SELECT
p.app_id,
p.app_name,
p.app_date FROM app p
WHERE p.app_id=2422
ON CONFLICT (app_id) DO
UPDATE SET app_date = excluded.app_date ;
O/P : ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
Can anyone explain why this is happening?
There are no constraints on foreign tables, because PostgreSQL cannot enforce data integrity on the foreign server – that is done by constraints defined on the foreign server.
To achieve what you want to do, you'll have to stick with the “traditional” way of doing this (e.g. this code sample).
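For reference, a minimal sketch of that traditional approach against the foreign table (not concurrency-safe; assumes app_id identifies the row on the remote side):
-- First try to update the existing remote row ...
UPDATE foreign_app f
SET    app_date = p.app_date
FROM   app p
WHERE  p.app_id = 2422
AND    f.app_id = p.app_id;

-- ... then insert it only if it does not exist yet
INSERT INTO foreign_app (app_id, app_name, app_date)
SELECT p.app_id, p.app_name, p.app_date
FROM   app p
WHERE  p.app_id = 2422
AND    NOT EXISTS (SELECT 1 FROM foreign_app f WHERE f.app_id = p.app_id);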
I know this is an old question, but in some cases there is a way to do it with ROW_NUMBER() OVER (PARTITION BY ...). In my case, my first take was to try ON CONFLICT...DO UPDATE, but that doesn't work on foreign tables (as stated above; hence my finding this question). My problem was very specific, in that I had a foreign table (f_zips) to be populated with the best zip code (postal code) information possible. I also had a local table, postcodes, with very good data and another local table, zips, with lower-quality zip code information but much more of it. For every record in postcodes, there is a corresponding record in zips, but the postal codes may not match. I wanted f_zips to hold the best data.
I solved this with a union, with a value of ind = 0 as the indicator that a record came from the better data set. A value of ind = 1 indicates lesser-quality data. Then I used row_number() over a partition to get the answer (where get_valid_zip5() is a local function that returns either a five-digit zip code or a null value):
insert into f_zips (recnum, postcode)
select s2.recnum, s2.zip5 from (
select s1.recnum, s1.zip5, s1.ind, row_number()
over (partition by recnum order by s1.ind) as rn from (
select recnum, get_valid_zip5(postcode) as zip5, 0 as ind
from postcodes
where get_valid_zip5(postcode) is not null
union
select recnum, get_valid_zip5(zip9) as zip5, 1 as ind
from zips
where get_valid_zip5(zip9) is not null
order by 1, 3) s1
) s2 where s2.rn = 1
;
I haven't run any performance tests, but for me this runs in cron and doesn't directly affect the users.
Verified on more than 900,000 records (SQL formatting omitted for brevity):
/* yes, the preferred data was entered when it existed in both tables */
select t1.recnum, t1.postcode, t2.zip9 from postcodes t1 join zips t2 on t1.recnum = t2.recnum where t1.postcode is not null and t2.zip9 is not null and t2.zip9 not in ('0') and length(t1.postcode)=5 and length(t2.zip9)=5 and t1.postcode <> t2.zip9 order by 1 limit 5;
recnum | postcode | zip9
----------+----------+-------
12022783 | 98409 | 98984
12022965 | 98226 | 98225
12023113 | 98023 | 98003
select * from f_zips where recnum in (12022783, 12022965, 12023113) order by 1;
recnum | postcode
----------+----------
12022783 | 98409
12022965 | 98226
12023113 | 98023
/* yes, entries came from the less-preferred dataset when they didn't exist in the better one */
select t1.recnum, t1.postcode, t2.zip9 from postcodes t1 right join zips t2 on t1.recnum = t2.recnum where t1.postcode is null and t2.zip9 is not null and t2.zip9 not in ('0') and length(t2.zip9)= 5 order by 1 limit 3;
recnum | postcode | zip9
----------+----------+-------
12021451 | | 98370
12022341 | | 98501
12022695 | | 98597
select * from f_zips where recnum in (12021451, 12022341, 12022695) order by 1;
recnum | postcode
----------+----------
12021451 | 98370
12022341 | 98501
12022695 | 98597
/* yes, entries came from the preferred dataset when the less-preferred one had invalid values */
select t1.recnum, t1.postcode, t2.zip9 from postcodes t1 left join zips t2 on t1.recnum = t2.recnum where t1.postcode is not null and t2.zip9 is null order by 1 limit 3;
recnum | postcode | zip9
----------+----------+------
12393585 | 98118 |
12393757 | 98101 |
12393835 | 98101 |
select * from f_zips where recnum in (12393585, 12393757, 12393835) order by 1;
recnum | postcode
----------+----------
12393585 | 98118
12393757 | 98101
12393835 | 98101
I plan to design a database model for a Business Intelligence system that stores business figures for a set of locations and a set of years.
Some of these figures should be calculated from other figures for the same year and the same location. In the following text I'll call figures that are not being calculated "basic figures". To store the basic figures, a table design with these columns would make sense:
| year | location_id | goods_costs | marketing_costs | warehouse_costs | administrative_costs |
Using this table I could create a view that calculates all other necessary figures:
CREATE VIEW all_figures
SELECT *,
goods_costs + marketing_costs + warehouse_costs + administrative_costs
AS total_costs
FROM basic_figures
This would be great if I didn't run into the following problems:
Most databases (including MySQL, which I'm planning to use [edit: but which I'm not bound to]) have some kind of column count or row size limit. Since I have to store a lot of figures (and have to calculate even more), I'd exceed this limit.
It is not uncommon that new figures have to be added. (Adding a figure would require changes to the table design, and since such changes usually perform poorly, they would block access to the table for quite a long time.)
I also have to store additional information for each figure, e.g. a description and a unit (all figures are decimal numbers, but some might be in US$/EUR whereas others might be in %). I'd have to make sure that the basic_figures table, the all_figures view and the table containing the figure information are all correctly updated if anything changes. (This is more a data normalization problem than a technical/implementation problem.)
~~
Therefore I considered to use this table design:
+---------+-------------+-------------+-------+
| year | location_id | figure_id | value |
+---------+-------------+-------------+-------+
| 2009 | 1 | goods_costs | 300 |
...
This entity-attribute-value-like design could be a first solution for these three issues. However, it would also have a new downside: Calculations get messy. Really messy.
To build a view similar to the one above, I'd have to use a query like this:
(SELECT * FROM basic_figures_eav)
UNION ALL
(SELECT a.year_id, a.location_id, "total_costs", a.value + b.value + c.value + d.value
FROM basic_figures_eav a
INNER JOIN basic_figures_eav b ON a.year_id = b.year_id AND a.location_id = b.location_id AND b.figure_id = "marketing_costs"
INNER JOIN basic_figures_eav c ON a.year_id = c.year_id AND a.location_id = c.location_id AND c.figure_id = "warehouse_costs"
INNER JOIN basic_figures_eav d ON a.year_id = d.year_id AND a.location_id = d.location_id AND d.figure_id = "administrative_costs"
WHERE a.figure_id = "goods_costs");
Isn't that a beauty? And notice that this is just the query for ONE figure. All other calculated figures (of which there are many, as I wrote above) would also have to be UNIONed with this query.
~~
After this long explanation of my problems, I now conclude with my actual questions:
Which database design would you suggest? / Would you use one of the two designs above? (If yes, which and why? If no, why?)
Do you have a suggestion for a completely other approach? (Which I would very, very much appreciate!)
Should the database actually be the one that does the calculations after all? Does it make more sense to move the calculation to the application logic and simply store the results?
By the way: I already asked a similar question on the MySQL forums. However, since answers were a bit sparse and this is not just a MySQL issue after all, I completely rewrote my question and posted it here. (So this is not a cross-post.) Here's the link to the thread there: http://forums.mysql.com/read.php?125,560752,560752#msg-560752
The question is (at least somewhat) DBMS specific.
If you can consider other DBMS, you might want to look at PostgreSQL and its hstore datatype, which is essentially a key/value store.
The downside of that is that you lose datatype checking, as everything is stored as a string in the map.
The design that you are aiming at is called "Entity Attribute Value". You might want to find other alternatives as well.
Edit, here is an example on how this could be used:
Table setup
CREATE TABLE basic_figures
(
year_id integer,
location_id integer,
figures hstore
);
insert into basic_figures (year_id, location_id, figures)
values
(1, 1, hstore ('marketing_costs => 200, goods_costs => 100, warehouse_costs => 400')),
(1, 2, hstore ('marketing_costs => 50, goods_costs => 75, warehouse_costs => 250')),
(1, 3, hstore ('adminstrative_costs => 100'));
Basic select
select year_id,
location_id,
to_number(figures -> 'marketing_costs', 'FM999999') as marketing_costs,
to_number(figures -> 'goods_costs', 'FM999999') as goods_costs,
to_number(figures -> 'warehouse_costs', 'FM999999') as warehouse_costs,
to_number(figures -> 'adminstrative_costs', 'FM999999') as adminstrative_costs
from basic_figures bf;
It's probably easier to create a view that hides the conversion of the hstore values. The downside is that the view needs to be re-created each time a new cost type is added.
Getting the totals
To get the sum of all costs for each year_id/location_id you can use the following statement:
SELECT year_id,
location_id,
sum(to_number(value, '99999')) as total
FROM (
SELECT year_id,
location_id,
(each(figures)).key,
(each(figures)).value
FROM basic_figures
) AS data
GROUP BY year_id, location_id;
year_id | location_id | total
---------+-------------+-------
1 | 3 | 100
1 | 2 | 375
1 | 1 | 700
That could be joined to the query above, but it's probably faster and easier to use if you create a function that calculates the total for all keys in a single hstore column:
Function to sum the totals
create or replace function sum_hstore(figures hstore)
returns bigint
as
$body$
declare
result bigint;
figure_values text[];
begin
result := 0;
figure_values := avals(figures);
for i in 1..array_length(figure_values, 1) loop
result := result + to_number(figure_values[i], '999999');
end loop;
return result;
end;
$body$
language plpgsql;
That function can easily be used in the first select:
select bf.year_id,
bf.location_id,
to_number(bf.figures -> 'marketing_costs', '99999999') as marketing_costs,
to_number(bf.figures -> 'goods_costs', '99999999') as goods_costs,
to_number(bf.figures -> 'warehouse_costs', '99999999') as warehouse_costs,
to_number(bf.figures -> 'adminstrative_costs', '99999999') as adminstrative_costs,
sum_hstore(bf.figures) as total
from basic_figures bf;
Automatic view creation
The following PL/pgSQL block can be used to (re-)create a view that contains one column for each key in the figures column plus the totals based on the sum_hstore function above:
do
$body$
declare
create_sql text;
types record;
begin
create_sql := 'create or replace view extended_figures as select year_id, location_id ';
for types in SELECT distinct (each(figures)).key as type_name FROM basic_figures loop
create_sql := create_sql || ', to_number(figures -> '''||types.type_name||''', ''9999999'') as '||types.type_name;
end loop;
create_sql := create_sql ||', sum_hstore(figures) as total from basic_figures';
execute create_sql;
end;
$body$
language plpgsql;
After running that function you can simply do a:
select *
from extended_figures
and you'll get as many columns as there are different cost types.
Note that there is no error checking at all if the values in the hstore are actually numbers. That could potentially be done with a trigger.
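A rough sketch of such a trigger (this assumes every hstore value must be numeric; the function and trigger names are illustrative):
create or replace function check_figures_numeric()
returns trigger
as
$body$
declare
  v text;
begin
  -- reject the row if any hstore value is not a plain number
  foreach v in array avals(new.figures) loop
    if v !~ '^[0-9]+(\.[0-9]+)?$' then
      raise exception 'figure value "%" is not numeric', v;
    end if;
  end loop;
  return new;
end;
$body$
language plpgsql;

create trigger basic_figures_check_numeric
before insert or update on basic_figures
for each row execute procedure check_figures_numeric();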
This is a way to "denormalise" (pivot) an EAV table without needing PIVOT. Note the LEFT JOINs and the COALESCE, which cause non-existent rows to appear as "zero cost".
NOTE: I had to change the quoting of the string literals to single quotes.
CREATE TABLE basic_figures_eav
( year_id INTEGER
, location_id INTEGER
, figure_id varchar
, value INTEGER
);
INSERT INTO basic_figures_eav ( year_id , location_id , figure_id , value ) VALUES
(1,1,'goods_costs', 100)
, (1,1,'marketing_costs', 200)
, (1,1,'warehouse_costs', 400)
, (1,1,'administrative_costs', 800)
, (1,2,'goods_costs', 100)
, (1,2,'marketing_costs', 200)
, (1,2,'warehouse_costs', 400)
, (1,3,'administrative_costs', 800)
;
SELECT x.year_id, x.location_id
, COALESCE (a.value,0) AS goods_costs
, COALESCE (b.value,0) AS marketing_costs
, COALESCE (c.value,0) AS warehouse_costs
, COALESCE (d.value,0) AS administrative_costs
--
, COALESCE (a.value,0)
+ COALESCE (b.value,0)
+ COALESCE (c.value,0)
+ COALESCE (d.value,0)
AS total_costs
-- need this to get all the {year_id,location_id} combinations
-- that have at least one tuple in the EAV table
FROM (
SELECT DISTINCT year_id, location_id
FROM basic_figures_eav
-- WHERE <selection of wanted observations>
) AS x
LEFT JOIN basic_figures_eav a ON a.year_id = x.year_id AND a.location_id = x.location_id AND a.figure_id = 'goods_costs'
LEFT JOIN basic_figures_eav b ON b.year_id = x.year_id AND b.location_id = x.location_id AND b.figure_id = 'marketing_costs'
LEFT JOIN basic_figures_eav c ON c.year_id = x.year_id AND c.location_id = x.location_id AND c.figure_id = 'warehouse_costs'
LEFT JOIN basic_figures_eav d ON d.year_id = x.year_id AND d.location_id = x.location_id AND d.figure_id = 'administrative_costs'
;
Result:
CREATE TABLE
INSERT 0 8
year_id | location_id | goods_costs | marketing_costs | warehouse_costs | administrative_costs | total_costs
---------+-------------+-------------+-----------------+-----------------+----------------------+-------------
1 | 3 | 0 | 0 | 0 | 800 | 800
1 | 2 | 100 | 200 | 400 | 0 | 700
1 | 1 | 100 | 200 | 400 | 800 | 1500
(3 rows)
I just want to point out that the second half of your query is needlessly complicated. You can do:
(SELECT a.year_id, a.location_id, "total_costs",
sum(a.value)
FROM basic_figures_eav a
where a.figure_id in ('marketing_costs', 'warehouse_costs', 'administrative_costs',
'goods_costs')
 GROUP BY a.year_id, a.location_id
)
Although this uses an aggregation, with a composite index on year_id, location_id, and figure_id, the performance should be similar.
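For example (index name illustrative):
CREATE INDEX idx_bfe_year_location_figure
    ON basic_figures_eav (year_id, location_id, figure_id);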
As for the rest of your question, there is a problem with databases limiting the number of columns. I would suggest that you put your base data in a table, with an auto-incremented primary key. Then, create summary tables, linked by the same primary key.
In many environments, you can recreate the summary tables once per day or once per night. If you need real time information, you can use stored procedures/triggers to update the data. That is, when data is updated or inserted, then it can be modified in the summary tables.
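As a rough sketch of the summary-table idea (table and column names are illustrative; a nightly job could rebuild it):
CREATE TABLE figure_totals (
    year_id     INT,
    location_id INT,
    total_costs DECIMAL(18,2),
    PRIMARY KEY (year_id, location_id)
);

-- Rebuild, e.g. once per night:
TRUNCATE TABLE figure_totals;

INSERT INTO figure_totals (year_id, location_id, total_costs)
SELECT year_id, location_id, SUM(value)
FROM   basic_figures_eav
WHERE  figure_id IN ('goods_costs', 'marketing_costs',
                     'warehouse_costs', 'administrative_costs')
GROUP BY year_id, location_id;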
Also, I tried to find out if calculated/computed columns in SQL Server count against the maximum number of columns in the table (1,024). I wasn't able to find anything definitive. This is easy enough to test, but I'm not near a database right now.