My VB.NET application produces simulation data that I want to store in an SQLite database. The data consists of hundreds of variables, each with values for up to 50k time steps (occurrences/measurements). The number of variables varies, and the number of time steps can range from 10 to 50k.
So far, I have one table. The first column contains the timestamp (primary key) and the following columns contain the values of each variable (column name = variable name). The rows are filled with the timestamps and the variable values for each time step:
timestamp | var1 | var2 | var3 | ...
----------------------------------------------
1 | var1(1) | var2(1) | var3(1) | ...
2 | var1(2) | var2(2) | var3(2) | ...
3 | var1(3) | var2(3) | var3(3) | ...
... | ... | ... | ... | ...
I use:
CREATE TABLE variables(timestamp INTEGER PRIMARY KEY, var1 REAL, var2 REAL, ...);
This works. I use the database to save the simulation data for later evaluation. I need to plot selected time series and copy the values of some variables for specific time spans to Excel (calculate sums, maxima, etc.).
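For reference, the read access is basically a range select over the timestamp, for example (using the column names from above; the time span is just an example value):
SELECT timestamp, var1, var3
FROM variables
WHERE timestamp BETWEEN 100 AND 2000
ORDER BY timestamp;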
I've read that I shouldn't add too many columns (I may have more than 500 variables/columns). Regarding performance, would it be better to structure it differently? For example, one table with four columns: ID (primary key), timestamp, variable name, and variable value.
ID | timestamp | varName | varValue
------------------------------------
1 | 1 | var1 | var1(1)
2 | 2 | var1 | var1(2)
...| ... | ... | ...
50 | 50 | var1 | var1(50)
51 | 1 | var2 | var2(1)
52 | 2 | var2 | var2(2)
...| ... | ... | ...
In this case I would have 50k time steps * 500 variables = 25 million rows, but a fixed number of columns. Is there any better way?
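For clarity, here is a sketch of that narrow layout in SQLite, roughly as I imagine it (the table name samples is just a placeholder, and instead of a separate ID column the combination of variable name and timestamp could act as the primary key, which would also serve as the index for per-variable range queries):
CREATE TABLE samples(
    varName   TEXT    NOT NULL,
    timestamp INTEGER NOT NULL,
    varValue  REAL,
    PRIMARY KEY(varName, timestamp)
);
-- typical read: one variable over a time span
SELECT timestamp, varValue
FROM samples
WHERE varName = 'var1' AND timestamp BETWEEN 100 AND 2000
ORDER BY timestamp;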
What happens to read-query performance if I do not insert the rows in ascending timestamp order?
Here's what I have: An Access database made up of 3 tables linked from SQL server. I need to create a new table in this database by querying the 3 source tables. Here are examples of the 3 tables I'm using:
PlanTable1
+------+------+------+------+---------+---------+
| Key1 | Key2 | Key3 | Key4 | PName | MainKey |
+------+------+------+------+---------+---------+
| 53 | 1 | 5 | -1 | Bikes | 536681 |
| 53 | 99 | -1 | -1 | Drinks | 536682 |
| 53 | 66 | 68 | -1 | Balls | 536683 |
+------+------+------+------+---------+---------+
SpTable
+----+---------+---------+
| ID | MainKey | SpName |
+----+---------+---------+
| 10 | 536681 | Wing1 |
| 11 | 536682 | Wing2 |
| 12 | 536683 | Wing3 |
+----+---------+---------+
LocTable
+-------+-------------+--------------+
| LocID | CenterState | CenterCity |
+-------+-------------+--------------+
| 10 | IN | Indianapolis |
| 11 | OH | Columbus |
| 12 | IL | Chicago |
+-------+-------------+--------------+
You can see the relationships between the tables. The NewMasterTable I need to create based on these will look something like this:
NewMasterTable
+-------+--------+-------------+------+--------------+-------+-------+-------+
| LocID | PName | CenterState | Key4 | CenterCity | Wing1 | Wing2 | Wing3 |
+-------+--------+-------------+------+--------------+-------+-------+-------+
| 10 | Bikes | IN | -1 | Indianapolis | 1 | 0 | 0 |
| 11 | Drinks | OH | -1 | Columbus | 0 | 1 | 0 |
| 12 | Balls | IL | -1 | Chicago | 0 | 0 | 1 |
+-------+--------+-------------+------+--------------+-------+-------+-------+
The hard part for me is making this new table dynamic. In the future, rows may be added to the source tables. I need my NewMasterTable to reflect any changes/additions to the source. How do I go about building the NewMasterTable as described? Does this make any sort of sense?
Since an Access table is a necessary requirement, probably the only way to go about it is to create a set of Update and Insert queries that are executed periodically. There is no built-in "dynamic" feature of Access that will monitor and update the table.
First, create the table. You could either 1) do this manually from scratch by defining the columns and constraints yourself, or 2) create a make-table query (i.e. SELECT... INTO) that generates most of the schema, then add any additional columns, edit necessary details and add appropriate indexes.
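For illustration only, a make-table sketch for option 2 could look something like this; the join conditions are guesses from your sample data (PlanTable1.MainKey appears to match SpTable.MainKey, and SpTable.ID appears to match LocTable.LocID), and the IIf expressions are just one simple way to produce the 0/1 Wing columns, so adjust all of it to your real keys:
SELECT LocTable.LocID, PlanTable1.PName, LocTable.CenterState,
       PlanTable1.Key4, LocTable.CenterCity,
       IIf(SpTable.SpName = "Wing1", 1, 0) AS Wing1,
       IIf(SpTable.SpName = "Wing2", 1, 0) AS Wing2,
       IIf(SpTable.SpName = "Wing3", 1, 0) AS Wing3
INTO NewMasterTable
FROM (PlanTable1
      INNER JOIN SpTable ON PlanTable1.MainKey = SpTable.MainKey)
      INNER JOIN LocTable ON SpTable.ID = LocTable.LocID;
Run once, this creates the table; after that, the saved queries described next keep it in sync.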
Define and save Update and Insert (and optionally Delete) queries to keep the table synced. I'm not sharing actual code for these, because I think that goes beyond your primary issue and requires specifics that only you can define. Due to some ambiguity with your key values (the field names and sample data still are not sufficient to reveal the precise relationships and constraints), it is likely that you'll need multiple Update statements.
In particular, the "Wing" columns will likely require a transform statement.
You may not be able to update all columns appropriately using a single query. I recommend not trying to force such an "artificial" requirement. Multiple queries can actually be easier to understand and maintain.
In the event that you experience "query is not updateable" errors, you may need to define other "temporary" tables with appropriate indexes, into which you do initial inserts from the linked tables, then subsequent queries to update your master table from those.
Finally, and I think this is the key to solving your problem, you need to define some Access form (or other code) that periodically runs your set of "sync" queries. Access forms have a [Timer Interval] property and corresponding Timer event that fires periodically. Add VBA code in the Form_Timer sub that runs all your queries. I would suggest "wrapping" such VBA in a transaction and adding appropriate error handling and error logging, etc.
I'm trying to make a database that will hold a table of objects, where each object is composed of objects from a second table. One table holds the possible sets and the other holds the possible components. The sets table has to include fields for each of a set's components, but each set has an unknown number of components. How do I make a table whose fields (Component 1, Component 2, Component 3, ...) depend on how many components each set needs?
Is there a way to do this just using the Access interface or will I actually have to get into the code behind it?
It would also solve my problem if there were a way to make a field act like an ArrayList, so if anyone can think of how to do that, please let me know.
Assuming that a component can be part of more than one set, what you need here is a many-to-many relationship.
In a database you don't do this with an arbitrary number of columns; you use a junction table.
When you need a tabular representation, you use a Pivot / Crosstab query.
Your data model could look like this:
Sets
+--------+----------+
| Set_ID | Set_Name |
+--------+----------+
| 1 | foo |
| 2 | bar |
+--------+----------+
Components
+--------------+----------------+
| Component_ID | Component_Name |
+--------------+----------------+
| 1 | aaa |
| 2 | bbb |
| 3 | ccc |
| 4 | ddd |
+--------------+----------------+
Junction table
+----------+----------------+
| f_Set_ID | f_Component_ID |
+----------+----------------+
| 1 | 2 |
| 1 | 4 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
+----------+----------------+
(f_ as in Foreign Key)
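For reference, a rough Access SQL sketch of the junction table, plus a crosstab that produces the tabular view, might look like this (the table name Junction, the LONG type and the constraint names are my own choices, and Access executes one statement per saved query, so these would be two separate queries):
CREATE TABLE Junction (
    f_Set_ID       LONG NOT NULL,
    f_Component_ID LONG NOT NULL,
    CONSTRAINT pk_Junction PRIMARY KEY (f_Set_ID, f_Component_ID),
    CONSTRAINT fk_Junction_Sets FOREIGN KEY (f_Set_ID)
        REFERENCES Sets (Set_ID),
    CONSTRAINT fk_Junction_Components FOREIGN KEY (f_Component_ID)
        REFERENCES Components (Component_ID)
);
TRANSFORM Count(j.f_Component_ID)
SELECT s.Set_Name
FROM (Sets AS s
      INNER JOIN Junction AS j ON j.f_Set_ID = s.Set_ID)
      INNER JOIN Components AS c ON c.Component_ID = j.f_Component_ID
GROUP BY s.Set_Name
PIVOT c.Component_Name;
With the sample data above, the crosstab gives one row per set and one column per component, with a count (or blank) in each cell, which is the tabular view without needing an arbitrary number of real columns.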
I am accessing an API where the number of fields can change at any time, but I must store and display the data as a table. Therefore, each record from the API is stored as one row per field. My problem is that I'm having trouble working out how to order by multiple columns at a time. Putting all of the data into a 2D array (a list of lists) before sorting is not a viable option, as the number of records could be too large to feasibly hold in memory.
I've put together a simple example to explain. If anyone has an idea on how to overcome the problem, or how I could redesign my approach, I'd be very grateful.
| record_id | field | data |
| 1 | x | 2 |
| 1 | y | 1 |
| 1 | z | 3 |
| 2 | x | 30 |
| 2 | y | 42 |
| 2 | z | 7 |
| 3 | x | 53 |
| 3 | y | 2 |
| 3 | z | 7 |
If ordering by fields 'z' then 'x' (both ascending), the record order would be 1,2,3
If ordering by fields 'z' then 'y' (both ascending), the record order would be 1,3,2
I am using Django models to store the data and QuerySets to retrieve it. I don't have any control over the API or the database from which I am originally accessing the data.
After a fair amount of research I realised I was going about this all wrong: I am now using an hstore field in Postgres (with django-hstore to utilise it) for a schema-less approach. I now have a single row per original record, and I can order_by after casting the required field in an extra() call.
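For what it's worth, the ordering boils down to SQL along these lines (the table name api_record and the hstore column name data are placeholders from my setup, not anything standard):
SELECT id, data
FROM api_record
ORDER BY (data -> 'z')::numeric, (data -> 'x')::numeric;
Since every original record is now a single row, ordering by several fields is just an ordinary multi-column ORDER BY again.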
I have a table that I'm trying to populate based on the values of two reference tables.
I have various projects ('Type 1', 'Type 2', etc.) that each run for 4 months and cost different amounts depending on where they are in their life cycle. These costs are shown in Ref Table 1.
Ref Table 1
Month | a | b | c | d
---------------------------------
Type 1 | 1 | 2 | 3 | 4
Type 2 | 10 | 20 | 30 | 40
Type 3 | 100 | 200 | 300 | 400
Ref Table 2 shows my schedule of projects for the next 3 months, with 2 new ones starting in Jan: one a Type 1 and the other a Type 2. In Feb I'll have 4 projects: the first two entering their second month, and two new ones starting, this time a Type 1 and a Type 3.
Ref table 2
Date | Jan | Feb | Mar
--------------------------
Type 1 | a | b | c
Type 1 | | a | b
Type 2 | a | b | c
Type 2 | | | a
Type 3 | | a | b
I'd like to create a table which calculates the total costs spent per project type each month. Example results are shown below in Results table.
Results
Date | Jan | Feb | Mar
-------------------------------
Type 1 | 1 | 3 | 5
Type 2 | 10 | 20 | 40
Type 3 | 0 | 100 | 200
I tried doing it with an array formula:
Res!b2 = {sum(if((Res!A2 = Ref2!A2:A6) * (Res!A2 = Ref1!A2:A4) * (Ref2!B2:D6 = Ref1!B1:D1), Ref!B2:E4))}
However, it doesn't work, and I believe that's because the third condition tries to compare a vector with another vector rather than with a single value.
Does anyone have any idea how I can do this? Happy to use arrays, index, match, vector, lookups but NOT VBA.
Thanks
Assuming that the months in the Results table headers are in the same order as in Ref table 2 (as per your example), then try this formula in Res!B2:
=SUM(SUMIF(Ref1!$B$1:$E$1,IF(Ref2!$A$2:$A$6=Res!$A2,Ref2!B$2:B$6),INDEX(Ref1!$B$2:$E$4,MATCH(Res!$A2,Ref1!$A$2:$A$4,0),0)))
Confirm with CTRL+SHIFT+ENTER and copy down and across.
That gives me the same results as in your Results table.
If the months might be in a different order then you can add something to check that too. I assumed that the types in the Results table row labels might be in a different order to Ref table 1, but if they are always in the same order too (as per your example) then the INDEX/MATCH part at the end can be simplified to a single range.
I have a table that stores a group of attributes and keeps them ordered in a sequence. The chance exists that one of the attributes (rows) could be deleted from the table, and the sequence of positions should be compacted.
For instance, if I originally have these set of values:
+----+--------+-----+
| id | name | pos |
+----+--------+-----+
| 1 | one | 1 |
| 2 | two | 2 |
| 3 | three | 3 |
| 4 | four | 4 |
+----+--------+-----+
And the second row was deleted, the position of all subsequent rows should be updated to close the gaps. The result should be this:
+----+--------+-----+
| id | name | pos |
+----+--------+-----+
| 1 | one | 1 |
| 3 | three | 2 |
| 4 | four | 3 |
+----+--------+-----+
Is there a way to do this update in a single query? How could I do this?
PS: I'd appreciate examples for both SQL Server and Oracle, since the system is supposed to support both engines. Thanks!
UPDATE: The reason for this is that users are allowed to modify the positions at will, as well as add or delete rows. Positions are shown to the user, and for that reason they should form a consistent sequence at all times (and this sequence must be stored, not generated on demand).
Not sure it works, but with Oracle I would try the following:
update my_table set pos = rownum;
This would work, but may be suboptimal for large datasets:
SQL> UPDATE my_table t
2 SET pos = (SELECT COUNT(*) FROM my_table WHERE id <= t.id);
3 rows updated
SQL> select * from my_table;
ID NAME POS
---------- ---------- ----------
1 one 1
3 three 2
4 four 3
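Since the question asks for SQL Server as well: there, a sketch with an updatable CTE and ROW_NUMBER() should give the same compaction (assuming the same my_table):
WITH ordered AS (
    SELECT pos, ROW_NUMBER() OVER (ORDER BY pos) AS new_pos
    FROM my_table
)
UPDATE ordered SET pos = new_pos;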
Do you really need the sequence values to be contiguous, or do you just need to be able to display the contiguous values? The easiest way to do this is to let the actual sequence become sparse and calculate the rank based on the order:
select id,
name,
dense_rank() over (order by pos) as pos,
pos as sparse_pos
from my_table
(note: this uses the DENSE_RANK analytic/window function, which works in Oracle and also in SQL Server 2005 and later)
If you make the position sparse in the first place, this would even make re-ordering easier, since you could make each new position halfway between the two existing ones. For instance, if you had a table like this:
+----+--------+-----+
| id | name | pos |
+----+--------+-----+
| 1 | one | 100 |
| 2 | two | 200 |
| 3 | three | 300 |
| 4 | four | 400 |
+----+--------+-----+
When it becomes time to move ID 4 into position 2, you'd just change the position to 150.
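In SQL terms that re-ordering is just a single-row update, after which the DENSE_RANK query above is re-run for display; a minimal sketch:
UPDATE my_table SET pos = 150 WHERE id = 4;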
Further explanation:
Using the above example, the user initially sees the following (because you're masking the position):
+----+--------+-----+
| id | name | pos |
+----+--------+-----+
| 1 | one | 1 |
| 2 | two | 2 |
| 3 | three | 3 |
| 4 | four | 4 |
+----+--------+-----+
When the user, through your interface, indicates that the record in position 4 needs to be moved to position 2, you update the position of ID 4 to 150, then re-run your query. The user sees this:
+----+--------+-----+
| id | name | pos |
+----+--------+-----+
| 1 | one | 1 |
| 4 | four | 2 |
| 2 | two | 3 |
| 3 | three | 4 |
+----+--------+-----+
The only reason this wouldn't work is if the user is editing the data directly in the database. Even in that case, though, I'd be inclined to use this kind of solution, via views and INSTEAD OF triggers.