AVCodec PTS timestamps not starting at 0 - c

I noticed that with some video files the PTS timestamps returned inside the AVPacket structure do not start at 0, but some time later. E.g. at 3.128 or something. 99% of the video files I tested has PTS timestamps starting at 0 but very few files have some strange timestamps that start at 3.128 or 1.2 or something. How am I supposed to handle these cases? Should I just record the PTS timestamp of the very first packet and then subtract this PTS from all following timestamp values to get a 0-based PTS value? Or what should I do with these non-0-based timestamps? Thanks for your help!

Libavcodec/avformat is just giving you the data that's in the file. Unfortunately (or perhaps fortunately, depending on your perspective), many file formats do not require timestamps to start at 0. In fact it can be important to have them start at other values if several files each make up part of a longer stream and you want to be able to non-destructively put them back together.
If you want 0-based timestamps, then like you said, you'll need to save the lowest/first timestamp and subtract that value from all timestamps. Note however that for some really ugly formats (like DVD video) it's common for timestamps to reset in the middle of the content, and this could even result in getting negative timestamps with your approach. If you expect you might be dealing with such content, you'll need to detect discontinuities and patch them up. Last I worked with avcodec/avformat, they didn't have a feature to do this for you automatically, but they might now. I'd look into it if you think you might need that.

Related

ArduinoBLE.h - multiple values in one characteristic

I recently read the documentation of the ArduinoBLE.h library. Under "Service design patterns" it is noted that it is possible to write multiple values to one characteristic:
How can I achieve that? Unfortunately I cannot find any information about this in the documentation.
I want to send all accelerometer data in one characteristic and all gyroscope data respectively along with a time stamp. This results in two characteristics
accChara: [time stamp, xAcc, yAcc, zAcc] and
gyroChara: [time stamp, xGyro, yGyro, zGyro],
where time stamp is an unsigned long (via millis()) and the values read are floats (note: I am using the Arduino_LSM9DS1.h library for the IMU).
The documentation states (as posted in your image):
The accelerometer characteristic above, for example, takes 11 bytes as a ASCII-encoded string.
They just created a comma separated string containing all their data ("200,133,150") and transmitted it using writeValue. You could do the same for your values and convert them back to numbers on your receiving end. Another way would be to use a struct to hold and send your data.

How to compress/archive a temperature curve effectively?

Summary: The industrial thermometer is used to sample temperature at the technology device. For few months, the samples are simply stored in the SQL database. Are there any well-known ways to compress the temperature curve so that much longer history could be stored effectively (say for the audit purpose)?
More details: Actually, there are much more thermometers, and possibly other sensors related to the technology. And there are well known time intervals where the curve belongs to a batch processed on the machine. The temperature curves should be added to the batch documentation.
My idea was that the temperature is a smooth function that could be interpolated somehow -- say the way a sound is compressed using MP3 format. The compression need not to be looseless. However, it must be possible to reconstruct the temperature curve (not necessarily the identical sample values, and the identical sampling interval) -- say, to be able to plot the curve or to tell what was the temperature in certain time.
The raw sample values from the SQL table would be processed, the compressed version would be stored elsewhere (possibly also in SQL database, as a blob), and later the raw samples can be deleted to save the database space.
Is there any well-known and widely used approach to the problem?
A simple approach would be code the temperature into a byte or two bytes, depending on the range and precision you need, and then to write the first temperature to your output, followed by the difference between temperatures for all the rest. For two-byte temperatures you can restrict the range some and write one or two bytes depending on the difference with a variable-length integer. E.g. if the high bit of the first byte is set, then the next byte contains 8 more bits of difference, allowing for 15 bits of difference. Most of the time it will be one byte, based on your description.
Then take that stream and feed it to a standard lossless compressor, e.g. zlib.
Any lossiness should be introduced at the sampling step, encoding only the number of bits you really need to encode the required range and precision. The rest of the process should then be lossless to avoid systematic drift in the decompressed values.
Subtracting successive values is the simplest predictor. In that case the prediction of the next value is the value before it. It may also be the most effective, depending on the noisiness of your data. If your data is really smooth, then you could try a higher-order predictor to see if you get better performance. E.g. a predictor for the next point using the last two points is 2a - b, where a is the previous point and b is the point before that, or using the last three points 3a - 3b + c, where c is the point before b. (These assume equal time steps between each.)

Is there an easy way to get the percentage of successful reads of last x minutes?

I have a setup with a Beaglebone Black which communicates over I²C with his slaves every second and reads data from them. Sometimes the I²C readout fails though, and I want to get statistics about these fails.
I would like to implement an algorithm which displays the percentage of successful communications of the last 5 minutes (up to 24 hours) and updates that value constantly. If I would implement that 'normally' with an array where I store success/no success of every second, that would mean a lot of wasted RAM/CPU load for a minor feature (especially if I would like to see the statistics of the last 24 hours).
Does someone know a good way to do that, or can anyone point me in the right direction?
Why don't you just implement a low-pass filter? For every successfull transfer, you push in a 1, for every failed one a 0; the result is a number between 0 and 1. Assuming that your transfers happen periodically, this works well -- and you just have to adjust the cutoff frequency of that filter to your desired "averaging duration".
However, I can't follow your RAM argument: assuming you store one byte representing success or failure per transfer, which you say happens every second, you end up with 86400B per day -- 85KB/day is really negligible.
EDIT Cutoff frequency is something from signal theory and describes the highest or lowest frequency that passes a low or high pass filter.
Implementing a low-pass filter is trivial; something like (pseudocode):
new_val = 1 //init with no failed transfers
alpha = 0.001
while(true):
old_val=new_val
success=do_transfer_and_return_1_on_success_or_0_on_failure()
new_val = alpha * success + (1-alpha) * old_val
That's a single-tap IIR (infinite impulse response) filter; single tap because there's only one alpha and thus, only one number that is stored as state.
EDIT2: the value of alpha defines the behaviour of this filter.
EDIT3: you can use a filter design tool to give you the right alpha; just set your low pass filter's cutoff frequency to something like 0.5/integrationLengthInSamples, select an order of 0 for the IIR and use an elliptic design method (most tools default to butterworth, but 0 order butterworths don't do a thing).
I'd use scipy and convert the resulting (b,a) tuple (a will be 1, here) to the correct form for this feedback form.
UPDATE In light of the comment by the OP 'determine a trend of which devices are failing' I would recommend the geometric average that Marcus Müller ꕺꕺ put forward.
ACCURATE METHOD
The method below is aimed at obtaining 'well defined' statistics for performance over time that are also useful for 'after the fact' analysis.
Notice that geometric average has a 'look back' over recent messages rather than fixed time period.
Maintain a rolling array of 24*60/5 = 288 'prior success rates' (SR[i] with i=-1, -2,...,-288) each representing a 5 minute interval in the preceding 24 hours.
That will consume about 2.5K if the elements are 64-bit doubles.
To 'effect' constant updating use an Estimated 'Current' Success Rate as follows:
ECSR = (t*S/M+(300-t)*SR[-1])/300
Where S and M are the count of errors and messages in the current (partially complete period. SR[-1] is the previous (now complete) bucket.
t is the number of seconds expired of the current bucket.
NB: When you start up you need to use 300*S/M/t.
In essence the approximation assumes the error rate was steady over the preceding 5 - 10 minutes.
To 'effect' a 24 hour look back you can either 'shuffle' the data down (by copy or memcpy()) at the end of each 5 minute interval or implement a 'circular array by keeping track of the current bucket index'.
NB: For many management/diagnostic purposes intervals of 15 minutes are often entirely adequate. You might want to make the 'grain' configurable.

Loading tiles for a 2D game

Im trying to make an 2D online game (with Z positions), and currently im working with loading a map from a txt file. I have three different map files. One contains an int for each tile saying what kind of floor there is, one saying what kind of decoration there is, and one saying what might be covering the tile. The problem is that the current map (20, 20, 30) takes 200 ms to load, and I want it to be much much bigger. I have tried to find a good solution for this and have so far come up with some ideas.
Recently I'v thought about storing all tiles in separate files, one file per tile. I'm not sure if this is a good idea (it feels wrong somehow), but it would mean that I wouldn't have to store any unneccessary tiles as "-1" in a text file and I would be able to just pick the right tile from the folder easily during run time (read the file named mapXYZ). If the tile is empty I would just be able to catch the FileNotFoundException. Could anyone tell me a reason for this being a bad solution? Other solutions I'v thought about would be to split the map into smaller parts or reading the map during startup in a BackgroundWorker.
Try making a much larger map in the same format as your current one first - it may be that the 200ms is mostly just overhead of opening and initial processing of the file.
If I'm understanding your proposed solution (opening one file per X,Y or X,Y,Z coordinate of a single map), this is a bad idea for two reasons:
There will be significant overhead to opening so many files.
Catching a FileNotFoundException and eating it will be significantly slower - there is actually a lot of overhead with catching exceptions, so you shouldn't rely on them to perform application logic.
Are you loading the file from a remote server? If so, that's why it's taking so long. Instead you should embed the file into the game. I'm saying this because you probably take 2-3 bytes per tile, so the file's about 30kb and 200ms sounds like a reasonable download time for that size of file (including overhead etc, and depending on your internet connection).
Regarding how to lower the filesize - there are two easy techniques I can think of that will decrease the filesize a bit:
1) If you have mostly empty squares and only some significant ones, your map is what is often referred to as 'sparse'. When storing a sparse array of data you can use a simple compression technique (formally known as 'run-length encoding') where each time you come accross empty squares, you specify how many of them there are. So for example instead of {0,0,0,0,0,0,0,0,0,0,1,1,2,3,0,0,0,0,0,0,0,0,0,0,0,0,1} you could store {10 0's, 1, 1, 2, 3, 12 0's, 1}
2) To save space, I recommend that you store everything as binary data. The exact setup of the file mainly depends on how many possible tile types there are, but this is a better solution than storing the ascii characters corresponding to the base-10 representation of the numers, separated by delimiters.
Example Binary Format
File is organized into segments which are 3 or 4 bytes long, as explained below.
First segment indicates the version of the game for which the map was created. 3 bytes long.
Segments 2, 3, and 4 indicate the dimensions of the map (x, y, z). 3 bytes long each.
The remaining segments all indicate either a tile number and is 3 bytes long with an MSB of 0. The exception to this follows.
If one of the tile segments is an empty tile, it is 4 bytes long with an MSB of 1, and indicates the number of empty tiles including that tile that follow.
The reason I suggest the MSB flag is so that you can distinguish between segments which are for tiles, and segments which indicate the number of empty tiles which follow that segment. For those segments I increase the length to 4 bytes (you might want to make it 5) so that you can store larger numbers of empty tiles per segment.

Best way to store time (hh:mm) in a database

I want to store times in a database table but only need to store the hours and minutes.
I know I could just use DATETIME and ignore the other components of the date, but what's the best way to do this without storing more info than I actually need?
You could store it as an integer of the number of minutes past midnight:
eg.
0 = 00:00
60 = 01:00
252 = 04:12
You would however need to write some code to reconstitute the time, but that shouldn't be tricky.
If you are using SQL Server 2008+, consider the TIME datatype. SQLTeam article with more usage examples.
DATETIME start DATETIME end
I implore you to use two DATETIME values instead, labelled something like event_start and event_end.
Time is a complex business
Most of the world has now adopted the denery based metric system for most measurements, rightly or wrongly. This is good overall, because at least we can all agree that a g, is a ml, is a cubic cm. At least approximately so. The metric system has many flaws, but at least it's internationally consistently flawed.
With time however, we have; 1000 milliseconds in a second, 60 seconds to a minute, 60 minutes to an hour, 12 hours for each half a day, approximately 30 days per month which vary by the month and even year in question, each country has its time offset from others, the way time is formatted in each country vary.
It's a lot to digest, but the long and short of it is impossible for such a complex scenario to have a simple solution.
Some corners can be cut, but there are those where it is wiser not to
Although the top answer here suggests that you store an integer of minutes past midnight might seem perfectly reasonable, I have learned to avoid doing so the hard way.
The reasons to implement two DATETIME values are for an increase in accuracy, resolution and feedback.
These are all very handy for when the design produces undesirable results.
Am I storing more data than required?
It might initially appear like more information is being stored than I require, but there is a good reason to take this hit.
Storing this extra information almost always ends up saving me time and effort in the long-run, because I inevitably find that when somebody is told how long something took, they'll additionally want to know when and where the event took place too.
It's a huge planet
In the past, I have been guilty of ignoring that there are other countries on this planet aside from my own. It seemed like a good idea at the time, but this has ALWAYS resulted in problems, headaches and wasted time later on down the line. ALWAYS consider all time zones.
C#
A DateTime renders nicely to a string in C#. The ToString(string Format) method is compact and easy to read.
E.g.
new TimeSpan(EventStart.Ticks - EventEnd.Ticks).ToString("h'h 'm'm 's's'")
SQL server
Also if you're reading your database seperate to your application interface, then dateTimes are pleasnat to read at a glance and performing calculations on them are straightforward.
E.g.
SELECT DATEDIFF(MINUTE, event_start, event_end)
ISO8601 date standard
If using SQLite then you don't have this, so instead use a Text field and store it in ISO8601 format eg.
"2013-01-27T12:30:00+0000"
Notes:
This uses 24 hour clock*
The time offset (or +0000) part of the ISO8601 maps directly to longitude value of a GPS coordiate (not taking into account daylight saving or countrywide).
E.g.
TimeOffset=(±Longitude.24)/360
...where ± refers to east or west direction.
It is therefore worth considering if it would be worth storing longitude, latitude and altitude along with the data. This will vary in application.
ISO8601 is an international format.
The wiki is very good for further details at http://en.wikipedia.org/wiki/ISO_8601.
The date and time is stored in international time and the offset is recorded depending on where in the world the time was stored.
In my experience there is always a need to store the full date and time, regardless of whether I think there is when I begin the project. ISO8601 is a very good, futureproof way of doing it.
Additional advice for free
It is also worth grouping events together like a chain. E.g. if recording a race, the whole event could be grouped by racer, race_circuit, circuit_checkpoints and circuit_laps.
In my experience, it is also wise to identify who stored the record. Either as a seperate table populated via trigger or as an additional column within the original table.
The more you put in, the more you get out
I completely understand the desire to be as economical with space as possible, but I would rarely do so at the expense of losing information.
A rule of thumb with databases is as the title says, a database can only tell you as much as it has data for, and it can be very costly to go back through historical data, filling in gaps.
The solution is to get it correct first time. This is certainly easier said than done, but you should now have a deeper insight of effective database design and subsequently stand a much improved chance of getting it right the first time.
The better your initial design, the less costly the repairs will be later on.
I only say all this, because if I could go back in time then it is what I'd tell myself when I got there.
Just store a regular datetime and ignore everything else. Why spend extra time writing code that loads an int, manipulates it, and converts it into a datetime, when you could just load a datetime?
since you didn't mention it bit if you are on SQL Server 2008 you can use the time datatype otherwise use minutes since midnight
SQL Server actually stores time as fractions of a day. For example, 1 whole day = value of 1. 12 hours is a value of 0.5.
If you want to store the time value without utilizing a DATETIME type, storing the time in a decimal form would suit that need, while also making conversion to a DATETIME simple.
For example:
SELECT CAST(0.5 AS DATETIME)
--1900-01-01 12:00:00.000
Storing the value as a DECIMAL(9,9) would consume 5 bytes. However, if precision to not of utmost importance, a REAL would consume only 4 bytes. In either case, aggregate calculation (i.e. mean time) can be easily calculated on numeric values, but not on Data/Time types.
I would convert them to an integer (HH*3600 + MM*60), and store it that way. Small storage size, and still easy enough to work with.
If you are using MySQL use a field type of TIME and the associated functionality that comes with TIME.
00:00:00 is standard unix time format.
If you ever have to look back and review the tables by hand, integers can be more confusing than an actual time stamp.
Instead of minutes-past-midnight we store it as 24 hours clock, as an SMALLINT.
09:12 = 912
14:15 = 1415
when converting back to "human readable form" we just insert a colon ":" two characters from the right. Left-pad with zeros if you need to. Saves the mathematics each way, and uses a few fewer bytes (compared to varchar), plus enforces that the value is numeric (rather than alphanumeric)
Pretty goofy though ... there should have been a TIME datatype in MS SQL for many a year already IMHO ...
Try smalldatetime. It may not give you what you want but it will help you in your future needs in date/time manipulations.
Are you sure you will only ever need the hours and minutes? If you want to do anything meaningful with it (like for example compute time spans between two such data points) not having information about time zones and DST may give incorrect results. Time zones do maybe not apply in your case, but DST most certainly will.
What I think you're asking for is a variable that will store minutes as a number. This can be done with the varying types of integer variable:
SELECT 9823754987598 AS MinutesInput
Then, in your program you could simply view this in the form you'd like by calculating:
long MinutesInAnHour = 60;
long MinutesInADay = MinutesInAnHour * 24;
long MinutesInAWeek = MinutesInADay * 7;
long MinutesCalc = long.Parse(rdr["MinutesInput"].toString()); //BigInt converts to long. rdr is an SqlDataReader.
long Weeks = MinutesCalc / MinutesInAWeek;
MinutesCalc -= Weeks * MinutesInAWeek;
long Days = MinutesCalc / MinutesInADay;
MinutesCalc -= Days * MinutesInADay;
long Hours = MinutesCalc / MinutesInAnHour;
MinutesCalc -= Hours * MinutesInAnHour;
long Minutes = MinutesCalc;
An issue arises where you request for efficiency to be used. But, if you're short for time then just use a nullable BigInt to store your minutes value.
A value of null means that the time hasn't been recorded yet.
Now, I will explain in the form of a round-trip to outer-space.
Unfortunately, a table column will only store a single type. Therefore, you will need to create a new table for each type as it is required.
For example:
If MinutesInput = 0..255 then use TinyInt (Convert as described above).
If MinutesInput = 256..131071 then use SmallInt (Note: SmallInt's min
value is -32,768. Therefore, negate and add 32768 when storing and
retrieving value to utilise full range before converting as above).
If MinutesInput = 131072..8589934591 then use Int (Note: Negate and add
2147483648 as necessary).
If MinutesInput = 8589934592..36893488147419103231 then use BigInt
(Note: Add and negate 9223372036854775808 as necessary).
If MinutesInput > 36893488147419103231 then I'd personally use
VARCHAR(X) increasing X as necessary since a char is a byte. I shall
have to revisit this answer at a later date to describe this in full
(or maybe a fellow stackoverflowee can finish this answer).
Since each value will undoubtedly require a unique key, the efficiency of the database will only be apparent if the range of the values stored are a good mix between very small (close to 0 minutes) and very high (Greater than 8589934591).
Until the values being stored actually reach a number greater than 36893488147419103231 then you might as well have a single BigInt column to represent your minutes, as you won't need to waste an Int on a unique identifier and another int to store the minutes value.
The saving of time in UTC format can help better as Kristen suggested.
Make sure that you are using 24 hr clock because there is no meridian AM or PM be used in UTC.
Example:
4:12 AM - 0412
10:12 AM - 1012
2:28 PM - 1428
11:56 PM - 2356
Its still preferrable to use standard four digit format.
Store the ticks as a long/bigint, which are currently measured in milliseconds. The updated value can be found by looking at the TimeSpan.TicksPerSecond value.
Most databases have a DateTime type that automatically stores the time as ticks behind the scenes, but in the case of some databases e.g. SqlLite, storing ticks can be a way to store the date.
Most languages allow the easy conversion from Ticks → TimeSpan → Ticks.
Example
In C# the code would be:
long TimeAsTicks = TimeAsTimeSpan.Ticks;
TimeAsTimeSpan = TimeSpan.FromTicks(TimeAsTicks);
Be aware though, because in the case of SqlLite, which only offers a small number of different types, which are; INT, REAL and VARCHAR It will be necessary to store the number of ticks as a string or two INT cells combined. This is, because an INT is a 32bit signed number whereas BIGINT is a 64bit signed number.
Note
My personal preference however, would be to store the date and time as an ISO8601 string.
IMHO what the best solution is depends to some extent on how you store time in the rest of the database (and the rest of your application)
Personally I have worked with SQLite and try to always use unix timestamps for storing absolute time, so when dealing with the time of day (like you ask for) I do what Glen Solsberry writes in his answer and store the number of seconds since midnight
When taking this general approach people (including me!) reading the code are less confused if I use the same standard everywhere

Resources