How do RRD values in a database dump translate to the input values? - database

I am having trouble understanding the values that I have saved in my Round Robin Database. I do a dump with rrdtool dump mydatabase and I get a dump of the data. I found the most recent update, and matched it to my rrd update command:
$rrdupdate --template=var1:var2:var3:var4:var5 N:15834740:839964:247212:156320:13493356
In my dump at the matching timestamp, I find these values:
<!-- 2016-12-01 10:30:00 CST / 1480609800 --> <row><v>9.0950245287e+04</v><v>4.8264158237e+03</v><v>1.4182428703e+03</v><v>8.9785764359e+02</v><v>7.7501969607e+04</v></row>
The first value is supposed to be var1. Out of scientific notation, that's 90,950.245287, which does not match up at all to my input value. (None of them are decimal.)
Is there something special I have to do to be able to convert values from my dump to get the standard value that I entered?

I can't give you specifics for your case, as you have not shown the full definition of your RRD file (internals, DS definition, etc), however...
Values stored in an RRDtool database are subject to data normalisation, and are then converted to rates (unless the DS is of type GAUGE, in which case they are assumed to be rates already).
Normalisation means the values are adjusted on a linear basis so that they fit exactly into the time sequence defined by the Interval (which is often 300 seconds).
If you want to see the values stored exactly as you write them, you need to set the DS type to GAUGE and make normalisation a null step. The only way to do the latter is to store the values exactly on a time boundary. So, if the Interval is 300s, store at 12:00:00, 12:05:00, and so on; otherwise the values will be adjusted.
There is a lot more information about normalisation (what it is, and why it is done) in Alex van den Bogaerdt's tutorial.
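To make the two adjustments above concrete, here is a rough Python sketch. It is a simplified model for illustration only, not RRDtool's actual implementation; the function names and the sample numbers are hypothetical, and it assumes a counter-style DS and a 300-second step.

```python
def counter_rate(prev_value, prev_time, value, time):
    """Per-second rate between two raw readings of a counter-style DS."""
    return (value - prev_value) / (time - prev_time)


def normalised_step(rate_before, rate_after, step_start, update_time, step=300):
    """Time-weighted average rate for one step that contains an update.

    rate_before applies from step_start up to update_time,
    rate_after applies from update_time to the end of the step.
    """
    weight_before = update_time - step_start
    weight_after = step_start + step - update_time
    return (rate_before * weight_before + rate_after * weight_after) / step


# Hypothetical readings: the counter grew by 3000 over 300 seconds.
print(counter_rate(1000, 0, 4000, 300))
# An update landing 100 s into a step mixes the two neighbouring rates.
print(normalised_step(10, 40, 0, 100))
```

If the update lands exactly on a step boundary, one of the two weights is zero and the value passes through unchanged, which is why the answer recommends storing on the boundary.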

Related

Anylogic: How to create plot from database table?

In my Anylogic model I successfully create plots of datasets that count the number of trucks arriving from terminals each hour in my simulation. Now I want to add the actual ("observed") number of trucks arriving at a terminal, to compare my simulation against these numbers. I added these numbers to a database table (see picture below). Is there a simple way of adding this data to the plot?
I tried creating a variable that reads the database table every hour and adding that to a dataset (as shown in the pictures below), but unfortunately this did not work (the plot was empty).
Maybe simply delete the variable and fill the dataset at the start of the model by looping through the database table data. Use the database query wizard to create a for-loop. Something like this should work:
int numEntries = (int) selectFrom(observed_arrivals).count();
DataSet myDataSet = new DataSet(numEntries);
List<Tuple> rows = selectFrom(observed_arrivals).list();
for (Tuple row : rows) {
    myDataSet.add(row.get( observed_arrivals.hour ), row.get( observed_arrivals.terminal_a ));
}
myChart.addDataSet(myDataSet);
You don't explain why it "didn't work" (what errors/problems did you get?), nor where you defined these elements.
(1) Since you want both observed (empirical) and simulated arrivals per terminal, datasets for each should be in the Terminal agent. And then the replicated plot (in Main) can have two data entries referring to data sets terminals(index).observedArrivals and terminals(index).simulatedArrivals or whatever you name them.
(2) Using getHourOfDay to add to the observed dataset is wrong because that just returns 0-23 (i.e., the hour in the current day for the current model date). Your database table looks like it has hours since model start, so you just want time(HOUR) to get the model time in elapsed hours (irrespective of what the model time unit is). Or possibly time(HOUR) - 1 if you only want to update the empirical arrivals for the hour at the end of that hour (i.e., at the same time that you updated the simulated arrivals).
(3) Using a Variable to get the database value each hour doesn't work because a variable's initial value is only evaluated once at model initialisation. You want an hourly cyclic Event in Terminal instead which adds the relevant row's value. (You need to use the Insert Database Query wizard to generate the relevant Java code for the query you need in the event's action.)
(4) Because you have a database table with specifically-named columns for each terminal (columns terminal_a and presumably terminal_b etc.) that makes it slightly more awkward. (This isn't proper relational table design where, instead of 4 columns for the 4 terminals, you'd instead have two columns for terminal_id and observed_value with a row for each time period and terminal combination.)
So your database query expression (in your Terminal agents) will need to use the SQL format (not the QueryDSL format) so that you can 'stitch in' the correct column name into the SQL.

indexed query to decimate time series results

The context here is I'm scoping a design that slices time-series data at user-defined intervals - too many permutations to simply roll up the data (eg, a 2nd hourly table, rolled up as (or after) the data ingest). I am not a database expert, so am hoping to learn if there are standard approaches to this that rely on table indexes, not duplicating the data in new table/collection, or otherwise encoding it a priori by file structure, etc. Especially if there is a particular db suited to or supporting this load. A prototype of the backend is in Mongo, but we can easily pivot to a more suited store at this time.
QUESTION: In any mainstream database or similar, is it possible to query a ~time series over a given time window, (efficiently) returning only data points at a consistent interval? (what db and example query, specifically).
My data is a few tens of GB today, but will grow if we're successful. I'd expect indexes on the timestamp to continue to fit in memory reasonably well. A custom file-based schema, such as one using Parquet, might work, but an off-the-shelf DB would be ideal. By consistent interval, I mean some "skip" meaningful to a human, such as
"every nth value", if the data could be assumed to be at a reliable cadence
or perhaps "first sample per hour", if not and the timestamps were epoch times
Eg, if my data is
ts      value
1001    1
1002    2
1003    3
... continuous ...
1010    10
1011    11
... etc ...
... large data set
and query had the parameters:
- skip_value: 10
- ts:
    - after: 1000
    - before: 2000
the returned set would be:
[ (1001, 1), (1011,11), (1021,21) .... (1991,991) ]
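For illustration only, here is a minimal Python sketch of the "every nth value" case using the standard-library sqlite3 module. It assumes the data arrives at a reliable cadence of one sample per ts unit, so a modulo on the timestamp is enough; the table name `samples`, the function `decimate`, and the choice of SQLite are assumptions made for the example, not a recommendation of a particular store.

```python
import sqlite3


def decimate(conn, after, before, skip):
    """Return every `skip`-th sample with after < ts < before.

    Assumes a regular cadence of 1 between consecutive ts values.
    """
    # Find the first timestamp in the window so the skip is anchored to it.
    first = conn.execute(
        "SELECT MIN(ts) FROM samples WHERE ts > ? AND ts < ?",
        (after, before),
    ).fetchone()[0]
    if first is None:
        return []
    # The primary key on ts lets the range predicate use an index.
    return conn.execute(
        "SELECT ts, value FROM samples "
        "WHERE ts > ? AND ts < ? AND (ts - ?) % ? = 0 "
        "ORDER BY ts",
        (after, before, first, skip),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ts INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(ts, ts - 1000) for ts in range(1001, 2501)],
)
result = decimate(conn, after=1000, before=2000, skip=10)
print(result[:3], result[-1])
```

If the cadence is not reliable, the modulo test no longer picks evenly spaced rows, and a bucketed approach (for example, the first sample per hour) would be needed instead.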

SQL Server query that returns data between two date times with format 01/07/2020 01:01:01 a. m

I've been having problems with a query that returns data between two date times, the query that I'm trying to fix is this one
pay.date BETWEEN '01/06/2020 00:28:46 a. m.' AND '01/06/2020 10:38:45 a. m.'
That query does not detect the a. m. part: if I have one payment at 10 am and another at 10 pm, it matches both, because the t. t. part is not detected. I've been searching for a while now with no luck. Thanks in advance :)
Do the filtering by an actual datetime type:
cast(replace(replace(pay.date, ' a. m.', 'am'), ' p. m.', 'pm') as datetime)
It might be better to use convert() so you can specify the proper format. If you can't supply the date literals in a readily convertible format then do a similar replace and cast on those too.
Use a literal format that is unambiguous and not dependent on runtime or connection settings. More info in Tibor's discussion.
In this case:
where pay.date between '20200601 00:28:46' and '20200601 10:38:45'
Notice that I assume June, not January; adjust as needed. BETWEEN is inclusive, so be certain that you understand the limitations of the datatype of pay.date. If it is datetime, the values are accurate to about 3 ms. Verify that your data is consistent with your assumption about accuracy to seconds.
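If the literals come from application code, one option is to normalise them before they reach the query. The following Python sketch converts strings in the question's format into the unambiguous `YYYYMMDD HH:MM:SS` form. It assumes day/month/year order (June, as above) and the exact "a. m." / "p. m." spelling shown in the question; `to_sql_literal` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Date and time, then "a. m." or "p. m." (the trailing period is optional).
_PATTERN = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) ([ap])\. m\.?$")


def to_sql_literal(text):
    """Convert 'dd/mm/yyyy hh:mm:ss a. m.' to 'YYYYMMDD HH:MM:SS' (24-hour)."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised date format: {text!r}")
    stamp = datetime.strptime(match.group(1), "%d/%m/%Y %H:%M:%S")
    # Fold the hour into 0-11, then add 12 for p. m.
    # This also accepts the '00:28:46 a. m.' form used in the question.
    hour = stamp.hour % 12
    if match.group(2) == "p":
        hour += 12
    return stamp.replace(hour=hour).strftime("%Y%m%d %H:%M:%S")


print(to_sql_literal("01/06/2020 00:28:46 a. m."))
print(to_sql_literal("01/06/2020 10:38:45 p. m."))
```

The resulting strings should still be passed as query parameters rather than concatenated into the SQL text.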

Pervasive PSQL Control Centre / Currency data type

Having issues updating a Pervasive PSQL table using Pervasive Control Centre and wonder if anyone can point me in the right direction. I'm struggling to update a field in the table whose type is '254-VB Currency'.
Sample query:
Update TABLE set "remBal" = 100.00 where 'Posting' = 215288;
The value that ends up in the remBal field is 463673729135463.6288
Pervasive version is v10.30. Updating via e.g. VAccess control works fine. It's just Pervasive Control Centre that doesn't.
The VAccess control supports more data types than the standard PSQL engine does. The VB Currency data type is not one that's natively supported in PSQL.
According to MSDN, the Currency data type is defined as:
Currency variables are stored as 64-bit (8-byte) numbers in an integer
format, scaled by 10,000 to give a fixed-point number with 15 digits
to the left of the decimal point and 4 digits to the right. This
representation provides a range of -922,337,203,685,477.5808 to
922,337,203,685,477.5807.
What I would suggest is to enter 100.00 into the database using VAccess, then look at the value in Control Center. You can then use that value in your SQL statement. It's not pretty, but it might work.
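The scaling in the MSDN definition can be checked with a few lines of Python. The sketch below also shows that the value reported in the question is exactly what you get when the 8 bytes of the IEEE 754 double 100.0 are read back as a Currency. That is only an observation about the numbers; the source does not confirm that this is what Control Center actually writes. The function names and the little-endian byte order are assumptions for the example.

```python
import struct
from decimal import Decimal


def currency_to_raw(amount):
    """Scale a decimal amount by 10,000 to get the stored 64-bit integer."""
    # Digits beyond the fourth decimal place are truncated by int().
    return int(Decimal(amount).scaleb(4))


def raw_to_currency(raw):
    """Interpret a stored 64-bit integer as a Currency value."""
    return Decimal(raw).scaleb(-4)


def double_bits_as_currency(value):
    """Reinterpret the 8 bytes of a double as a Currency value."""
    raw = struct.unpack("<q", struct.pack("<d", value))[0]
    return raw_to_currency(raw)


print(currency_to_raw("100.00"))
print(raw_to_currency(1000000))
print(double_bits_as_currency(100.0))
```

If that reading is right, the integer a Currency field needs for 100.00 is 1000000, which may help when comparing against the value VAccess stores.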

SQL DataType - How to store a year?

I need to insert a year (e.g. 1988, 1990, etc.) in a database. When I used the Date or Datetime data type, it showed errors. Which datatype should I use?
A regular 4-byte INT is way too big; it is a waste of space!
You don't say what database you're using, so I can't recommend a specific datatype. Everyone is saying "use integer", but most databases store integers as 4 bytes, which is way more than you need. You should use a two byte integer (smallint on SQL Server), which will conserve space.
If you need to store a year in the database, you would either want to use an Integer datatype (if you are dead set on only storing the year) or a DateTime datatype (which would involve storing a date that basically is 1/1/1990 00:00:00 in format).
Hey, you can use the YEAR data type in MySQL.
It is available in two-digit or four-digit format.
Note: Values allowed in four-digit format: 1901 to 2155. Values allowed in two-digit format: 70 to 69 (that is, 70 to 99 followed by 00 to 69), representing years from 1970 to 2069.
Storing a "Year" in MSSQL would ideally depend on what you are doing with it and what the meaning of that "year" would be to your application and database. That being said, there are a few things to state here. There is no "DataType" for Year as of 2012 in MSSQL. I would lean toward using SMALLINT, as it is only 2 bytes (saving you 2 of the 4 bytes that INT demands). Your limitation is that you cannot have a year later than 32767 (as of SQL Server 2008R2). I really do not think SQL will be the database of choice ten thousand years from now, let alone in the year 32767. You may consider INT, as the Year() function in MSSQL does convert the data type "DATE" to an INT. Like I said, it depends on where you are getting the data and where it is going, but SMALLINT should be just fine. INT would be overkill, unless you have other reasons like the one I mentioned above or the code requirements need it in INT form (e.g. integrating with an existing application).
Just a year, nothing else?
Why not use a simple integer?
Use integer if all you need to store is the year. You can also use datetime if you think there will be date based calculations while querying this column
Storage may be only part of the issue. How will this value be used in a query?
Is it going to be compared with another date-time data types, or will all the associated rows also have numeric values?
How would you deal with a change to the requirements? How easily could you react to a request to replace the year with a smaller time slice? i.e. Now they want it broken down by quarters?
A numeric type can be easily used in a date time query by having a look-up table to join with containing things like the start and stop dates (1/1/X to 12/31/x), etc..
I don't think using an integer or any subtype of integer is a good choice. Sooner or later you will have to do other date-like operations on it. Also, in 2019 let's not worry too much about space. See what those saved 2 bytes cost us in 2000.
I suggest using a date built from year + 0101, converted to a true date. Similarly, if you need to store a month of a year, store year + month + 01 as a true date.
If you do that, you will be able to properly do "date stuff" on it later on.
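As a small illustration of that suggestion, this Python sketch builds the "first day of the period" date on the application side before it is inserted. The helper names are hypothetical, and the database column is assumed to be a true date type.

```python
from datetime import date


def year_as_date(year):
    """Represent a bare year as January 1 of that year."""
    return date(year, 1, 1)


def month_as_date(year, month):
    """Represent a year and month as the first day of that month."""
    return date(year, month, 1)


print(year_as_date(1988).isoformat())      # 1988-01-01
print(month_as_date(1990, 7).isoformat())  # 1990-07-01
```

Stored this way, the value can later be compared or joined with other date columns without a lookup table.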
