I'm really confused about something named Accuracy in SQL Server.
I found some information about the date and time data types in SQL Server, and it mentions a property named accuracy.
Could someone please explain, in simple terms, what accuracy really means?
Massive thanks in advance.
It comes down to how many bits you have and perhaps how many you decide to use. 1-bit has 2 possible values. 2 bits has 4. 3 has 8. In general, given n bits of storage, there are 2^n possible distinct values. A byte with 8 bits can have 256 values.
For a positive integer stored in a byte, the range is 0 to 255 because zero counts as one possible value. Including negative numbers does not change the number of values. For example, a signed byte has a range of -128 to 127. (Two's complement is convenient for hardware.)
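As a quick check of the counting above, here is a minimal Python sketch. The function names are made up for illustration; it only computes how many values fit in n bits and the resulting unsigned and two's-complement ranges.

```python
def value_count(bits):
    """Number of distinct values that fit in `bits` bits."""
    return 2 ** bits


def unsigned_range(bits):
    """Smallest and largest unsigned integer for `bits` bits."""
    return 0, 2 ** bits - 1


def signed_range(bits):
    """Smallest and largest two's-complement integer for `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


print(value_count(8))     # 256
print(unsigned_range(8))  # (0, 255)
print(signed_range(8))    # (-128, 127)
```

Both ranges contain exactly 256 values; only the position of zero within the range changes.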
Things are "exact" for integers, with the restriction on range. For something like time or distance, we use real numbers. A property of real numbers is that there are an infinite number of real numbers between any two real numbers. And if we pick a range, there are not enough bytes to represent all the real numbers in between. We approximate the number by assigning some bits to an exponent and some to a mantissa.
So long story short, we are assigning a fixed number of bits to represent an infinite number of time values over a given range.
For a smalldatetime, the first 2 bytes are for the day and the second 2 bytes are for the time. For a datetime, each part is 4 bytes. The smalldatetime day part of 2 bytes allows 2^16 values, or 65,536 days. 65,536 days * (1 year / 365 days) = about 179 years, which does match the year range of the smalldatetime (1900 to 2079). For the datetime day part, 2^32 days / 365 = about 11.8 million years, which is far more than the datetime range of 1753 to 9999 that is actually used. The storage is there, but engineers don't have to use it.
Now for the time part. The datetime can be converted to a float. The integer part will be the day and the fraction the fraction of the day. (This does not work for datetime2.)
If we used every bit of the time part, then 2 bytes allows for 65,536 values and 4 bytes allows for 4,294,967,296 values for a single day. These are the best possible precisions for each.
24 hr * (3600 sec / 1 hr) * (1000 ms / 1 sec) / 4,294,967,296 = 0.02 milliseconds
24 hr * (3600 sec / 1 hr) / 65,536 = 1.3 seconds
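The two figures can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, assuming every bit of the time part divides one day evenly; `best_resolution_seconds` is a name invented here.

```python
SECONDS_PER_DAY = 24 * 3600


def best_resolution_seconds(time_bytes):
    """Finest step if every bit of the time part were used to divide one day."""
    return SECONDS_PER_DAY / 2 ** (8 * time_bytes)


print(best_resolution_seconds(4) * 1000)  # in milliseconds, roughly 0.02
print(best_resolution_seconds(2))         # in seconds, roughly 1.3
```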
These are not the precisions that Microsoft engineers decided to use; however, they are the best that can be stored in this number of bytes. The choice of a 3ms precision was likely related to hardware and API restrictions at the time. (A slow HW clock might have been the only standard.) The 1 minute precision was likely a round up. Can't have a second, so let's round it to minutes.
The end result is that if you measure the time 1 million times between now and a millisecond from now and store each reading in a datetime, you will see at most 1 or 2 distinct values stored. The same is true if you store the readings in a smalldatetime. If you see 2 values, it's because you crossed into the interval of the next value.
Clear as mud?
I was browsing msdn in regards to data type sizes in T-SQL and noticed something I'm a bit confused on.
According to http://msdn.microsoft.com/en-us/library/ms186724.aspx, datetime uses 8 bytes and stores a date from years 1753-9999 with a time precision of hh:mm:ss[.nnn]. Now if you look at date and time separately, time uses 3-5 bytes to store hh:mm:ss[.nnnnnnn] and date uses 3 bytes to store years 0001-9999.
What confuses me is that using date and time separately gives you a wider range of years and time with four more digits of precision than datetime, yet they both use 8 bytes? Why does datetime have a smaller range and less precision yet uses the same size to store itself?
The Datetime data type precedes the separate date and time data types. The Datetime data type uses 8 bytes, as two integers. The first integer stores 01/01/1900 as 0. Any day before 1900 is stored as a negative number of days before that date, and any day after it is stored as a positive number of days after 01/01/1900.
The reason the range only starts at 1753 is the switch to the Gregorian calendar: Britain and its colonies made the change in 1752, so 1753 is the first full Gregorian year there, and for any date before that you need to know the country to determine the date. This was the decision made by the original Sybase developers, from which SQL Server is descended.
The time integer portion stores the number of ticks since midnight. A tick is 1/300 second.
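To make the two-integer layout concrete, here is a rough Python sketch of the idea. It is not SQL Server's actual code; it assumes only what is described above (days relative to 1900-01-01 and ticks of 1/300 second since midnight), and the function names are hypothetical.

```python
from datetime import datetime, timedelta

BASE_DATE = datetime(1900, 1, 1)
TICKS_PER_SECOND = 300
TICKS_PER_DAY = 86400 * TICKS_PER_SECOND


def encode_datetime(value):
    """Split a timestamp into (days since 1900-01-01, ticks since midnight)."""
    delta = value - BASE_DATE
    since_midnight = delta - timedelta(days=delta.days)
    # Round the time of day to the nearest 1/300 second.
    ticks = round(since_midnight.total_seconds() * TICKS_PER_SECOND)
    if ticks == TICKS_PER_DAY:  # rounding pushed us onto the next day
        return delta.days + 1, 0
    return delta.days, ticks


def decode_datetime(days, ticks):
    """Rebuild a timestamp from the two stored integers."""
    return BASE_DATE + timedelta(days=days, seconds=ticks / TICKS_PER_SECOND)


print(encode_datetime(datetime(1900, 1, 1, 12, 0)))  # (0, 12960000)
print(encode_datetime(datetime(1899, 12, 31)))       # (-1, 0)
print(decode_datetime(0, 12960000))                  # 1900-01-01 12:00:00
```

Because the time is rounded to whole ticks, two readings taken one millisecond apart can land on the same stored value.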
Good examples and info can be found here;
http://blogs.lessthandot.com/index.php/DataMgmt/DataDesign/how-are-dates-stored-in-sql-server
and http://karaszi.com/the-ultimate-guide-to-the-datetime-datatypes
I create a pilot logbook software application and pilots log flight time in various formats. In the US pilots log flight time typically in tenths, i.e. 90 minutes of flying would be logged as 1.5. International/European pilots typically log time in HH:MM so the 90 minutes would be logged as 1:30. Some may also want other formats such as 1 + 30.
I need a universal way of storing this value in SQL Server such that it can be converted to the display format as shown above. I'd hate to have to have two columns for every field, one is decimal time, the other is total minutes. Or I could convert somehow the decimal time into a total minutes representation and store it as an INT.
We've also discussed storing TICKS. The problem comes in when aggregating the values later in reports, pivot table systems such as DevExpress's XtraPivotGrid, etc. Even more of a concern is how this data will be handled on mobile devices, such as the logbook apps on iPhone, Android, etc. using SQLite. In the past we had issues with Palm OS and the size of the number (integer) and overflow problems. When you add up TICKS from 30 years of a pilot's flying, you could end up with a huge number. Doing averages, etc....
For pilots who track flight time in decimal, we use time scales in which, for example, 27-33 minutes is a .5. If you fly for 32 minutes, you log 0.5; if you fly for 1 hour and 33 minutes, you log 1.5. So we need to apply the time scale when converting the stored value to the display value. For a European pilot, that last flight, logged as 1.5 in the US, would be 1:33.
How do you suggest we store a pilot's flight time in the database so it can be converted to tenths, hundredths, or HH:MM presentation? Also consider aggregates, mobile apps, etc.
Thank you.
Use the definition of TIME (or DATETIME as you wish) native to your database's variant of SQL. Transform to/from the external representations as required on output/input.
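Whichever column type is chosen, the conversion to the display formats happens on output. Here is a hedged Python sketch that assumes the stored value is a whole number of minutes, one of the options raised in the question. Only the 27-33 minute bucket for .5 comes from the question; the rest of the tenths table and the function names are assumptions for illustration.

```python
# Upper bound (minutes past the hour) of each tenth, .0 through .9.
# Only the 27-33 -> .5 bucket is taken from the question; the other
# bounds are an assumed scale and should be replaced with the real one.
TENTH_UPPER_BOUNDS = [2, 8, 14, 20, 26, 33, 39, 45, 51, 57]


def to_tenths(total_minutes):
    """Format stored minutes as decimal hours using the time scale."""
    hours, minutes = divmod(total_minutes, 60)
    for tenth, upper in enumerate(TENTH_UPPER_BOUNDS):
        if minutes <= upper:
            return f"{hours}.{tenth}"
    return f"{hours + 1}.0"  # 58-59 minutes round up to the next hour


def to_hh_mm(total_minutes, separator=":"):
    """Format stored minutes as H:MM, or with another separator."""
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}{separator}{minutes:02d}"


flights = [90, 93, 32]        # stored values, in minutes
total = sum(flights)          # aggregates are plain integer sums
print(to_tenths(93))          # 1.5
print(to_hh_mm(93))           # 1:33
print(to_hh_mm(90, " + "))    # 1 + 30
print(to_hh_mm(total))        # 3:35
```

Summing the stored minutes and formatting the total once avoids accumulating the rounding that each per-flight tenths value carries.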
I'm working on a horse racing application and need to store elapsed times from races in a table. I will be importing data from a comma-delimited file that provides the final time in one format and the interior elapsed times in another. The following is an example:
Final Time: 109.39 (1 minute, 9 seconds and 39/100th seconds)
Quarter Time: 2260 (22 seconds and 60/100th seconds)
Half Time: 4524 (45 seconds and 24/100th seconds)
Three Quarters: 5693 (56 seconds and 93/100th seconds)
I'll want to have the flexibility to easily do things like feet per seconds calculations and to convert elapsed times to splits. I'll also want to be able to easily display the times (elapsed or splits) in fifth of seconds or in hundredths.
Times in fifths: :223 :451 :564 1:091 (note the last digits are superscripts)
Times in hundredths: :22.60 :45.24 :56.93 1:09.39
Thanks in advance for your input.
Generally timespans are either stored as (1) seconds elapsed or (2) start / end datetime. Seconds elapsed can be an integer or a float / double if you require it. You could be creative / crazy and store all times as milliseconds in which case you'd only need an integer.
If you are using PostgreSQL, you can use the interval datatype. Otherwise, any integer (int4, int8) or number type your database supports is OK. Of course, store all values in a single unit of measure: seconds, minutes, or milliseconds.
It all depends on how you intend to use it, but number of elapsed seconds (perhaps as a float if necessary) is certainly a favorite.
I think the 109.39 representing 1 min 9.39 sec is pretty silly. Unambiguous, sure, historical tradition maybe, but it's miserable to do computations with that format. (Not impossible, but fixing it during import sounds easy.)
I'd store time in a decimal format of some sort -- either an integer representing hundredths-of-a-second, as all your other times are displayed, or a data-base specific decimal-aware format.
Standard floating point representations might eventually lead you to wonder why a horse that ran two laps in 20.1 seconds each took 40.200035 seconds to run both laps combined.
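Here is a rough Python sketch of the integer-hundredths approach, with the conversion done at import time. It assumes both file formats are minute, two-digit second and two-digit hundredth digits, with or without the decimal point. The function names are invented, and the three-quarter time uses 56.93, the value shown in the hundredths row of the question.

```python
SUPERSCRIPTS = "⁰¹²³⁴"


def parse_time(text):
    """Convert '109.39' or '2260' to integer hundredths of a second."""
    digits = text.replace(".", "").zfill(5)
    minutes = int(digits[:-4])
    seconds = int(digits[-4:-2])
    hundredths = int(digits[-2:])
    return (minutes * 60 + seconds) * 100 + hundredths


def format_hundredths(value):
    """6939 -> '1:09.39', 2260 -> ':22.60'."""
    seconds, hundredths = divmod(value, 100)
    minutes, seconds = divmod(seconds, 60)
    return f"{minutes or ''}:{seconds:02d}.{hundredths:02d}"


def format_fifths(value):
    """Truncate to fifths of a second, shown as a superscript digit."""
    seconds, hundredths = divmod(value, 100)
    minutes, seconds = divmod(seconds, 60)
    return f"{minutes or ''}:{seconds:02d}{SUPERSCRIPTS[hundredths // 20]}"


def splits(elapsed):
    """Turn cumulative elapsed times into per-segment times."""
    return [later - earlier for earlier, later in zip([0] + elapsed, elapsed)]


elapsed = [parse_time(t) for t in ["2260", "4524", "5693", "109.39"]]
print(elapsed)                                    # [2260, 4524, 5693, 6939]
print([format_fifths(t) for t in elapsed])        # [':22³', ':45¹', ':56⁴', '1:09¹']
print([format_hundredths(t) for t in splits(elapsed)])
print(0.1 + 0.2 == 0.3)                           # False: binary floats are inexact
```

With integers, two laps of 2010 hundredths add up to exactly 4020, which sidesteps the floating-point surprise described above.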
I want to store times in a database table but only need to store the hours and minutes.
I know I could just use DATETIME and ignore the other components of the date, but what's the best way to do this without storing more info than I actually need?
You could store it as an integer of the number of minutes past midnight:
eg.
0 = 00:00
60 = 01:00
252 = 04:12
You would however need to write some code to reconstitute the time, but that shouldn't be tricky.
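The code to go back and forth is short. A minimal Python sketch, with made-up function names:

```python
def to_minutes_past_midnight(hhmm):
    """'04:12' -> 252"""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def from_minutes_past_midnight(value):
    """252 -> '04:12'"""
    hours, minutes = divmod(value, 60)
    return f"{hours:02d}:{minutes:02d}"


print(to_minutes_past_midnight("04:12"))  # 252
print(from_minutes_past_midnight(252))    # 04:12
print(from_minutes_past_midnight(60))     # 01:00
```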
If you are using SQL Server 2008+, consider the TIME datatype. SQLTeam article with more usage examples.
DATETIME start DATETIME end
I implore you to use two DATETIME values instead, labelled something like event_start and event_end.
Time is a complex business
Most of the world has now adopted the denary-based metric system for most measurements, rightly or wrongly. This is good overall, because at least we can all agree that a g is a ml is a cubic cm. At least approximately so. The metric system has many flaws, but at least it's internationally consistently flawed.
With time, however, we have 1000 milliseconds in a second, 60 seconds in a minute, 60 minutes in an hour, 12 hours in each half of a day, and approximately 30 days per month, varying by the month and even the year in question. On top of that, each country has its own time offset from the others, and the way time is formatted varies from country to country.
It's a lot to digest, but the long and short of it is that it is impossible for such a complex scenario to have a simple solution.
Some corners can be cut, but there are those where it is wiser not to
Although the top answer here, which suggests storing an integer of minutes past midnight, might seem perfectly reasonable, I have learned the hard way to avoid doing so.
The reasons to implement two DATETIME values are for an increase in accuracy, resolution and feedback.
These are all very handy for when the design produces undesirable results.
Am I storing more data than required?
It might initially appear like more information is being stored than I require, but there is a good reason to take this hit.
Storing this extra information almost always ends up saving me time and effort in the long-run, because I inevitably find that when somebody is told how long something took, they'll additionally want to know when and where the event took place too.
It's a huge planet
In the past, I have been guilty of ignoring that there are other countries on this planet aside from my own. It seemed like a good idea at the time, but this has ALWAYS resulted in problems, headaches and wasted time later on down the line. ALWAYS consider all time zones.
C#
A DateTime renders nicely to a string in C#. The ToString(string Format) method is compact and easy to read.
E.g.
new TimeSpan(EventEnd.Ticks - EventStart.Ticks).ToString("h'h 'm'm 's's'")
SQL server
Also, if you're reading your database separately from your application interface, then DATETIME values are pleasant to read at a glance, and performing calculations on them is straightforward.
E.g.
SELECT DATEDIFF(MINUTE, event_start, event_end)
ISO8601 date standard
If using SQLite then you don't have this, so instead use a Text field and store it in ISO8601 format eg.
"2013-01-27T12:30:00+0000"
Notes:
This uses 24 hour clock*
The time offset (or +0000) part of the ISO8601 string maps roughly to the longitude value of a GPS coordinate (not taking into account daylight saving or country-wide time zones).
E.g.
TimeOffset = (±Longitude × 24) / 360
...where ± refers to east or west direction and TimeOffset is in hours.
It is therefore worth considering if it would be worth storing longitude, latitude and altitude along with the data. This will vary in application.
ISO8601 is an international format.
The wiki is very good for further details at http://en.wikipedia.org/wiki/ISO_8601.
The date and time is stored in international time and the offset is recorded depending on where in the world the time was stored.
In my experience there is always a need to store the full date and time, regardless of whether I think there is when I begin the project. ISO8601 is a very good, futureproof way of doing it.
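As an illustration of the text-column approach, here is a small Python sketch that round-trips the ISO8601 string shown above and computes an elapsed time from two such values. It also includes the longitude rule of thumb from the notes. The helper names are invented, and the end time is made up for the example.

```python
from datetime import datetime

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def to_iso(value):
    """Format a timezone-aware datetime for storage in a text column."""
    return value.strftime(ISO_FORMAT)


def from_iso(text):
    """Parse a stored string back into a timezone-aware datetime."""
    return datetime.strptime(text, ISO_FORMAT)


def nominal_offset_hours(longitude_degrees):
    """Rule-of-thumb offset from longitude (east positive); ignores
    daylight saving and political time zone boundaries."""
    return longitude_degrees * 24 / 360


event_start = from_iso("2013-01-27T12:30:00+0000")
event_end = from_iso("2013-01-27T15:00:00+0100")  # 14:00 in UTC
print(event_end - event_start)     # 1:30:00
print(to_iso(event_start))         # 2013-01-27T12:30:00+0000
print(nominal_offset_hours(-75))   # -5.0
```

Because both values carry their offset, the subtraction gives the right duration even though the two strings were recorded in different zones.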
Additional advice for free
It is also worth grouping events together like a chain. E.g. if recording a race, the whole event could be grouped by racer, race_circuit, circuit_checkpoints and circuit_laps.
In my experience, it is also wise to identify who stored the record, either in a separate table populated via a trigger or as an additional column within the original table.
The more you put in, the more you get out
I completely understand the desire to be as economical with space as possible, but I would rarely do so at the expense of losing information.
A rule of thumb with databases is as the title says, a database can only tell you as much as it has data for, and it can be very costly to go back through historical data, filling in gaps.
The solution is to get it correct first time. This is certainly easier said than done, but you should now have a deeper insight of effective database design and subsequently stand a much improved chance of getting it right the first time.
The better your initial design, the less costly the repairs will be later on.
I only say all this, because if I could go back in time then it is what I'd tell myself when I got there.
Just store a regular datetime and ignore everything else. Why spend extra time writing code that loads an int, manipulates it, and converts it into a datetime, when you could just load a datetime?
You didn't mention which version you are on, but if you are on SQL Server 2008 you can use the time datatype; otherwise use minutes since midnight.
When a DATETIME is converted to a number, SQL Server expresses the time as a fraction of a day. For example, 1 whole day = a value of 1, and 12 hours is a value of 0.5.
If you want to store the time value without utilizing a DATETIME type, storing the time in a decimal form would suit that need, while also making conversion to a DATETIME simple.
For example:
SELECT CAST(0.5 AS DATETIME)
--1900-01-01 12:00:00.000
Storing the value as a DECIMAL(9,9) would consume 5 bytes. However, if precision is not of utmost importance, a REAL would consume only 4 bytes. In either case, aggregate calculations (e.g. mean time) can be done easily on numeric values, but not on Date/Time types.
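The fraction-of-a-day idea, including the mean-time aggregate, can be sketched in Python as follows. The function names are hypothetical, and the result is rounded to whole seconds for display.

```python
SECONDS_PER_DAY = 86400


def time_to_fraction(hours, minutes, seconds=0):
    """Express a time of day as a fraction of a day (12:00 -> 0.5)."""
    return (hours * 3600 + minutes * 60 + seconds) / SECONDS_PER_DAY


def fraction_to_time(fraction):
    """Convert a fraction of a day back to HH:MM:SS."""
    total = round(fraction * SECONDS_PER_DAY) % SECONDS_PER_DAY
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


times = [time_to_fraction(9, 0), time_to_fraction(15, 0)]
mean = sum(times) / len(times)
print(fraction_to_time(0.5))   # 12:00:00
print(fraction_to_time(mean))  # 12:00:00
```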
I would convert them to an integer (HH*3600 + MM*60), and store it that way. Small storage size, and still easy enough to work with.
If you are using MySQL use a field type of TIME and the associated functionality that comes with TIME.
00:00:00 (HH:MM:SS) is a standard time format.
If you ever have to look back and review the tables by hand, integers can be more confusing than an actual time stamp.
Instead of minutes-past-midnight, we store it as a 24-hour clock value in a SMALLINT.
09:12 = 912
14:15 = 1415
When converting back to "human readable form", we just insert a colon ":" two characters from the right, left-padding with zeros if needed. This saves the mathematics each way, uses a few fewer bytes (compared to varchar), and enforces that the value is numeric (rather than alphanumeric).
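A minimal Python sketch of that encoding follows, with invented function names. A validity check is included because a numeric column alone would still accept values such as 1260 or 2400.

```python
def to_hhmm_number(text):
    """'09:12' -> 912"""
    return int(text.replace(":", ""))


def from_hhmm_number(value):
    """912 -> '09:12': left-pad to four digits, then insert the colon."""
    digits = f"{value:04d}"
    return digits[:-2] + ":" + digits[-2:]


def is_valid_hhmm_number(value):
    """Reject numbers whose hour or minute part is out of range."""
    hours, minutes = divmod(value, 100)
    return 0 <= hours <= 23 and 0 <= minutes <= 59


print(to_hhmm_number("09:12"))     # 912
print(from_hhmm_number(1415))      # 14:15
print(is_valid_hhmm_number(1260))  # False
```

One trade-off: arithmetic on the stored number is not time arithmetic. For example, 1400 - 930 gives 470, not 4:30.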
Pretty goofy though ... there should have been a TIME datatype in MS SQL for many a year already IMHO ...
Try smalldatetime. It may not give you what you want but it will help you in your future needs in date/time manipulations.
Are you sure you will only ever need the hours and minutes? If you want to do anything meaningful with the values (for example, compute time spans between two such data points), not having information about time zones and DST may give incorrect results. Time zones may not apply in your case, but DST most certainly will.
What I think you're asking for is a variable that will store minutes as a number. This can be done with the varying types of integer variable:
SELECT 9823754987598 AS MinutesInput
Then, in your program you could simply view this in the form you'd like by calculating:
long MinutesInAnHour = 60;
long MinutesInADay = MinutesInAnHour * 24;
long MinutesInAWeek = MinutesInADay * 7;
long MinutesCalc = long.Parse(rdr["MinutesInput"].ToString()); //BigInt converts to long. rdr is an SqlDataReader.
long Weeks = MinutesCalc / MinutesInAWeek;
MinutesCalc -= Weeks * MinutesInAWeek;
long Days = MinutesCalc / MinutesInADay;
MinutesCalc -= Days * MinutesInADay;
long Hours = MinutesCalc / MinutesInAnHour;
MinutesCalc -= Hours * MinutesInAnHour;
long Minutes = MinutesCalc;
The harder part is the storage efficiency you asked for. If you're short on time, just use a nullable BigInt to store your minutes value.
A value of null means that the time hasn't been recorded yet.
Now, I will explain in the form of a round-trip to outer-space.
Unfortunately, a table column will only store a single type. Therefore, you will need to create a new table for each type as it is required.
For example:
If MinutesInput = 0..255 then use TinyInt (convert as described above).
If MinutesInput = 256..65535 then use SmallInt (Note: SmallInt's range
is -32,768 to 32,767. Therefore, subtract 32768 when storing and add it
back when retrieving to utilise the full range before converting as above).
If MinutesInput = 65536..4294967295 then use Int (Note: subtract and add
2147483648 as necessary).
If MinutesInput = 4294967296..18446744073709551615 then use BigInt
(Note: subtract and add 9223372036854775808 as necessary).
If MinutesInput > 18446744073709551615 then I'd personally use
VARCHAR(X), increasing X as necessary since a char is a byte. I shall
have to revisit this answer at a later date to describe this in full
(or maybe a fellow stackoverflowee can finish this answer).
Since each value will undoubtedly require a unique key, the efficiency of the database will only be apparent if the values stored are a good mix of very small (close to 0 minutes) and very large (greater than 4294967295).
Until the values being stored actually exceed 18446744073709551615, you might as well have a single BigInt column to represent your minutes, as you won't need to waste an Int on a unique identifier and another Int to store the minutes value.
Saving the time in UTC can help further, as Kristen suggested.
Make sure that you are using the 24-hour clock, because no AM or PM meridian indicator is used in UTC.
Example:
4:12 AM - 0412
10:12 AM - 1012
2:28 PM - 1428
11:56 PM - 2356
It's still preferable to use the standard four-digit format.
Store the ticks as a long/bigint. In .NET a tick is currently 100 nanoseconds, so there are 10,000,000 ticks per second; the current value can be found by looking at the TimeSpan.TicksPerSecond value.
Most databases have a DateTime type that automatically stores the time as ticks behind the scenes, but in the case of some databases, e.g. SQLite, storing ticks can be a way to store the date.
Most languages allow the easy conversion from Ticks → TimeSpan → Ticks.
Example
In C# the code would be:
long TimeAsTicks = TimeAsTimeSpan.Ticks;
TimeAsTimeSpan = TimeSpan.FromTicks(TimeAsTicks);
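For a client that is not written in .NET, the same round trip can be sketched in Python. This assumes the stored number uses .NET's tick size (TimeSpan.TicksPerSecond is 10,000,000, i.e. 100 ns per tick); the function names are invented. Python's timedelta only resolves microseconds, so the last digit of a tick count is dropped.

```python
from datetime import timedelta

TICKS_PER_SECOND = 10_000_000  # matches .NET's TimeSpan.TicksPerSecond


def ticks_to_timedelta(ticks):
    """Convert a stored tick count to a timedelta (truncates below 1 µs)."""
    return timedelta(microseconds=ticks // 10)


def timedelta_to_ticks(value):
    """Convert a timedelta to a tick count suitable for a 64-bit column."""
    whole_seconds = value.days * 86400 + value.seconds
    return whole_seconds * TICKS_PER_SECOND + value.microseconds * 10


ninety_minutes = timedelta_to_ticks(timedelta(minutes=90))
print(ninety_minutes)                      # 54000000000
print(ticks_to_timedelta(ninety_minutes))  # 1:30:00

# Even 30 years expressed in ticks stays well inside a signed 64-bit integer.
thirty_years = timedelta_to_ticks(timedelta(days=30 * 365))
print(thirty_years < 2 ** 63)              # True
```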
Be aware, though, that SQLite offers only a small number of storage classes: NULL, INTEGER, REAL, TEXT and BLOB. Its INTEGER class holds signed values of up to 8 bytes, so a 64-bit tick count fits in a single INTEGER column and does not need to be stored as a string or split across two columns. Client code that reads the value must still use a 64-bit type, because a 32-bit signed INT would overflow where a 64-bit BIGINT does not.
Note
My personal preference however, would be to store the date and time as an ISO8601 string.
IMHO what the best solution is depends to some extent on how you store time in the rest of the database (and the rest of your application)
Personally I have worked with SQLite and try to always use unix timestamps for storing absolute time, so when dealing with the time of day (like you ask for) I do what Glen Solsberry writes in his answer and store the number of seconds since midnight
When taking this general approach people (including me!) reading the code are less confused if I use the same standard everywhere
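For what it's worth, here is a small sketch of the seconds-since-midnight approach using Python's built-in sqlite3 module and an in-memory database. The table and column names are made up for the example.

```python
import sqlite3


def format_time_of_day(seconds_since_midnight):
    """15120 -> '04:12'"""
    hours, remainder = divmod(seconds_since_midnight, 3600)
    return f"{hours:02d}:{remainder // 60:02d}"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE opening_hours (label TEXT, seconds_since_midnight INTEGER)"
)
conn.execute(
    "INSERT INTO opening_hours VALUES (?, ?)",
    ("opens", 4 * 3600 + 12 * 60),
)

(stored,) = conn.execute(
    "SELECT seconds_since_midnight FROM opening_hours WHERE label = ?",
    ("opens",),
).fetchone()
formatted = format_time_of_day(stored)
print(stored, formatted)  # 15120 04:12
conn.close()
```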