Field data type / validations questions - database

I have a few questions on what data types to use and how to define some fields from my site. My current schema is in MySQL, but I am in the process of changing to PostgreSQL.
First & last name -> Since the site is multi-language, all tables support UTF-8, but do I need to declare these columns as nvarchar in case a user enters a Chinese name? If so, how do I enforce field validation if it is set to accept alphabetic characters only? I assume that means English letters, which would not validate Chinese or Arabic characters. And I don't think PostgreSQL supports nvarchar anyway?
To store a current timeline -> Example: I work in company A from Jan 2009 to present. So I assume there will be 3 fields for this: timeline_to, timeline_from and timeline_present, where to & from are month/year varchars and present is just a flag to set the current date?
User passwords. I am using SHA-256 + salting, so I have 2 fields declared as follows:
password_hash - varchar (64)
password_salt - varchar (64)
Does this work if the user password needs to be between 8 and 32 chars long?
Birth time -> I need to record birth time for the application to calculate some astrological values, so that means hour, minute and am/pm. Is it best to store these as 3 separate single-select lists with varchar, or to use a time data type in the back end and let users pick from single-select lists in the front end?
Lastly, for birth month and year only, are these int or varchar if I store them in separate rows? They all have primary keys of int for reporting purposes, so does int make more sense? Or should I store them in 1 field only as a date type?

NCHAR, not NVARCHAR.
Never make anything variable that you can make fixed; it is an added burden to pack/unpack on every access. That means never use variable-length types for indexed columns, or you will have a very sluggish index. Disk space is cheap these days.
You need a Language column at the Person level that tells you what language to use in your various parsing and validation requirements.
Let's say you have Person, Employer, and Employment tables. The columns you discuss are in Employment.
You need a StartDate column and an EndDate column; both are the DATETIME datatype.
You do not need "present" as a separate column. "Present" is always the value of the newest Employment row, unless set to something different. Give EndDate a default of the highest date the database can handle, e.g. 9999-12-31, which can be overridden by an explicit entry.
No. You only need one CHAR(256) column. Hank has explained it.
For any component of a date or time, use the DATETIME datatype. That is what it is there for. The database handles it consistently and indexes it perfectly. You perform date arithmetic on it using the database's various date functions, and you avoid all the problems of coding it as INTs, etc. (no invalid dates or times allowed).
BirthDateTime is one DATETIME column.
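As a rough sketch of the open-ended EndDate idea above, written in Python rather than SQL: the Employment class and its field names are made up for illustration, and date.max (which is 9999-12-31) stands in for the column default.

```python
from dataclasses import dataclass
from datetime import date

# Sentinel meaning "no end date yet"; date.max is 9999-12-31.
OPEN_END = date.max


@dataclass
class Employment:
    employer: str
    start_date: date
    end_date: date = OPEN_END  # plays the role of the column DEFAULT

    def is_current(self, today: date) -> bool:
        # A row is current when today falls inside [start_date, end_date].
        return self.start_date <= today <= self.end_date


job = Employment("Company A", date(2009, 1, 1))
old_job = Employment("Company B", date(2005, 3, 1), date(2008, 12, 31))
```

With a sentinel end date, "currently employed" is an ordinary range comparison and needs no separate flag column.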

I have no idea, never dealt with that field much.
You might consider allowing NULL here and using it as a special meaning for Present. If your application logic sees a non-null start date and a null end date, you can infer this. If they are both NULL, then no information can be inferred.
Since you're hashing, you'll always get a 256-bit digest (64 hexadecimal characters) as the output no matter what the input is, so yes, 8-32 character passwords will all work.
Use a DATETIME in the backend. You can do things like MONTH() to extract the parts right in your SQL syntax. Of course, you'll have to format the date just right for SQL to accept it, but that's not too hard.
Again, all extractable with the DATETIME functions in SQL.
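To illustrate the point about hash length, here is a small Python sketch of the salted SHA-256 scheme described in the question. The helper name hash_password and the choice of a 32-byte random salt are assumptions made for the example, not something the question specifies.

```python
import hashlib
import os


def hash_password(password: str, salt_hex: str) -> str:
    # SHA-256 over salt bytes + password bytes, returned as hexadecimal text.
    data = bytes.fromhex(salt_hex) + password.encode("utf-8")
    return hashlib.sha256(data).hexdigest()


salt = os.urandom(32).hex()                 # 32 random bytes -> 64 hex characters
short_hash = hash_password("8chars!!", salt)   # 8-character password
long_hash = hash_password("x" * 32, salt)      # 32-character password
```

Both hashes come out at 64 hexadecimal characters, so a 64-character column holds the digest whatever the password length.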

Related

Can't group on this selection using Dates column from SQL in Pivot Table in Excel

I have a pivot table in Excel, which I'd like to be able to group on using the SaleDate column.
However, when I've created my Pivot Table, right-clicked an element in the field and chosen Grouping..., I get the error:
Cannot Group on this selection
Which I've figured is because there are either
1) Blanks in the column, or
2) The column is not of data type Date in Excel
I've copied the whole column to Notepad++ and performed a Find what: (a blank space), but that gave nothing in return, i.e. there are no blank spaces in the column.
That leaves option number 2, and since I can't filter the SaleDate column on Year or Month, it seems in fact to be interpreted as text rather than a Date.
I'm using a SQL database as a source, which I have tried to adjust to work around this. My raw data in SQL is of data type numeric, so I first need to convert it to varchar and then to date. Note that these are three different approaches I have used to adjust the date in SQL, and I have checked that the table to which I save the data is indeed of data type date in SQL:
left(convert(date,convert(varchar,Rawdate,110),110),7) as SaleDate
convert(date,convert(varchar,Rawdate,112),112) as SaleDate
convert(date,convert(varchar,Rawdate,110),110) as SaleDate
These return, in order, yyyy-mm, yyyy-mm-dd and yyyy-mm-dd, but none of them works for grouping in the Pivot Table, or for filtering on month or year in Excel.
While I have never worked specifically on Excel workbooks that use SQL Server directly, I know SQL Server has many date & time types, unlike .NET's C# which has fewer, or Excel which has only one(1). The types of SQL Server itself are not that cooperative with each other to begin with(2), so I wouldn't be surprised if issues arise from even the tiniest differences when trying to port to other technologies, which I have faced a few times.
With that in mind, and your evidence of a likely failure in date conversion, plus the chained conversions you mentioned, my first suggestion is to feed the Excel a different date type, and my first choices would be datetime or datetime2, for being the most popular, the most complete, and the most similar to Excel's lonely type.
(1): It's more like zero, it's just an integer with everybody around it giving it special treatment, which they fail at half the time.
(2): Why would int to/from datetime be fine, but int to/from datetime2 is not...
If you make the field of interest a row field and click the "Filter" triangle in the column heading, then often right at the bottom of the list in that PivotFilter box you'll see the item(s) causing the problem. Be aware that it might be text that simply looks like a date, or it might be something more obviously wrong.
As per my comment, another way to diagnose what's going on is by taking just a few of the items that you are sure are dates, putting them into a range, making a PivotTable out of them, and seeing if you can group them. If you can, then you know that the problem is indeed likely some text in the data. If you can't, then it's likely you have text that still needs to be converted to dates...but you'll need to post some examples here in order for us to give you suggestions on how to turn it into something Excel recognizes as a date.
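The same kind of check can be run outside Excel. The Python sketch below assumes the raw numeric values are meant to be read as yyyymmdd (which the question does not confirm) and lists the entries that do not form a real date; parse_yyyymmdd and the sample values are made up for illustration.

```python
from datetime import date, datetime
from typing import Optional


def parse_yyyymmdd(raw) -> Optional[date]:
    # Return a date for values such as 20190131, or None when the value
    # is not a valid calendar date in yyyymmdd form.
    try:
        return datetime.strptime(str(raw).strip(), "%Y%m%d").date()
    except ValueError:
        return None


# Hypothetical sample: two good dates, one impossible date, one text value.
raw_values = [20190131, 20190230, "2019-03", 20191224]
bad_values = [v for v in raw_values if parse_yyyymmdd(v) is None]
```

Anything that ends up in bad_values is a candidate for the entries that stop the pivot table from grouping.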

Store a specific hour but in datetime type SQL Server

I have a table that has two columns, HourFrom and HourTo, and these hours can be changed. I don't want the type to be nvarchar; the only thing it needs to store is, for example, 08:00 or 23:00.
Is there a way to design the table with functions so that it only shows the hour, instead of creating (for example) a stored procedure that saves the datetime with these functions?
The reason behind this is that I have an entity in my backend where these two members are DateTime, as they should be, and I don't want to mix types or do weird casts or use split, indexOf, etc.
You can use time(0); the data may look like 03:06:12 or 08:45:00. I have no idea how you are going to use the data. E.g., in a where clause such as where HourFrom between '03:00:00' and '04:00:00', does it matter if HourFrom contains minutes and seconds?
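For comparison, this is roughly how a time-of-day value behaves in application code. It is a Python sketch, not SQL Server; the variable names mirror the HourFrom and HourTo columns and in_range is a hypothetical helper.

```python
from datetime import time

# Time-of-day values with no date part, like a time(0) column.
hour_from = time.fromisoformat("08:00")
hour_to = time.fromisoformat("23:00")


def in_range(t: time) -> bool:
    # Inclusive comparison, similar to BETWEEN in SQL.
    return hour_from <= t <= hour_to


shown = hour_from.strftime("%H:%M")  # format only when displaying
```

The value keeps minutes and seconds internally; showing only hours and minutes is a formatting decision made at display time.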

about date in database question

I need to find data between two dates and times.
I use one field for the date and one field for the time.
Would it be better to use only one field for date and time?
I see that it comes in dd/mm/yyyy hh:mm:ss format, which can contain both date and time.
This question is for Access and for SQL Server.
Thanks in advance.
In nearly all circumstances, date and time are needed together. Both Access and SQL server have a date/time data type. In Access, even if you specify the format as time, you can show a date. This is because all datetime data is stored as a number, and time is the decimal portion:
Say I store data: 10:31:46, I can type lines in the immediate window that illustrate the storage of datetime, like so:
?CDec(DlookUp("TimeFormattedField", "Test"))
0.438726851851852
?Year(DlookUp("TimeFormattedField", "Test"))
1899
?Format(dlookup("F4", "Table2"),"dd/mm/yyyy")
30/12/1899
This is because zero (0) is a valid date.
It is very easy to get the different portions of a datetime field, so store datetime as a single field, because that is what you are going to get anyway.
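The storage described above can be reproduced in a few lines of Python: the whole part of the number counts days from 30/12/1899 and the decimal part is the fraction of a day. The function name from_access_serial is made up for this sketch.

```python
from datetime import datetime, timedelta

# Day zero of the Access date/time type.
ACCESS_EPOCH = datetime(1899, 12, 30)


def from_access_serial(serial: float) -> datetime:
    # Whole part = days since the epoch, fractional part = time of day.
    return ACCESS_EPOCH + timedelta(days=serial)


stored = from_access_serial(0.438726851851852)
text = stored.strftime("%d/%m/%Y %H:%M:%S")
```

The value 0.438726851851852 from the immediate window maps back to 10:31:46 on 30/12/1899, which is why Year() returns 1899 for a field that only shows a time.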
I like to store date and time separately. In general, I almost never need time in my apps. One case where I store them separately is in some of my logging routines. This is mostly because I only ever query on dates and never on date+time.
If you need to query on both date and time, then storing them separately is really problematic, because then you have to concatenate two fields for comparison, and that means your criteria won't use any indexes on the two fields. This may not be an issue for a few thousand records, but for anything above that, it can quickly become quite a performance drag. It's also a major issue if you're using a server back end, since all the rows will have to be pulled across the wire, instead of Access/Jet/ACE being able to hand off the selection to the server.
It depends on the requirement. If you are using SQL Server 2008 or later, which has separate date and time types, storing them separately is not a problem, and the queries are still easy to write.

Create day, date, time in DB - Question

Create_date -> Assume it will be recorded as 5/2/2009. For search purposes, when searching on create date, can we search by individual month or year from this, or do we need to also record a create_day, create_month and create_year? One of the search filters for user content will be, for example, "Show content from last 2 weeks, last month, last year, current month only".
create_day -> While we record a numeric date, many times we may need to display the text day (i.e. Thursday) to show when the object was created, like we see on social websites: "created on Thursday, Jan 29, 2009 at 3:45pm". To get this output, do we need to record a create_day for all objects / activities, or is it calculated on each page load at application level?
create_time -> What time are we storing in the DB? User local time or a fixed time? This is a global website. If fixed, then let's say I use GMT by default. The next question is then how to display the correct time to a user so it matches his local time. The calculation must take into account that in the US the time changes twice a year (daylight saving). Or maybe always record time in GMT but display it to users based on their detected time zone, but then does that mean calculating time from the time zone on each page load?
And since I am here, a side Q-> Any difference between lookup and reference table? How to distinguish these two tables -> "Account_status" which has values Active, Confirmed, etc... AND another table "City" which has city names. First table is a system ID table used in back end only. The city table is ID table used by users to select city from. Are these both lookup or reference or same/different?
Create_date You'll usually want to use the DBMS's underlying date storage type, which stores all pieces of a date (and time if need be, like MySQL datetime). This opens up the ability to use date processing functions within the DBMS. For the filter example, you would calculate what the date was 2 weeks ago. You may also have a function that accepts such a string in the DBMS. PHP's strtotime() allows this.
Create_day The text output is either automatic (MySQL datetime for instance) or it's easy to render whichever piece you need (PHP date() can do this). It's generally a Good Idea to store a time stamp (datetime) for every record.
Create_time If it's a global app you'll want to use UTC/GMT. Every user will have their offset applied to the displayed times as well as to their form submissions (such as searching). PHP's DateTime objects allow specifying a time zone name as the offset. The user can choose from a list, but in its standard form that's a big list and isn't very usable if it doesn't even contain their city (http://us.php.net/manual/en/timezones.php). The other option is getting the time zone of the user's computer via JavaScript.
Lookup Table Yes those are both lookup tables.
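The three date points above can be sketched together in Python (the answer itself refers to PHP and MySQL). The timestamp, the helper name in_last_two_weeks and the fixed UTC-5 offset are assumptions made for the example; a real application would use a named time zone so that daylight saving is handled.

```python
from datetime import datetime, timedelta, timezone

# Stored once, in UTC.
created_utc = datetime(2009, 1, 29, 20, 45, tzinfo=timezone.utc)


def in_last_two_weeks(created: datetime, now: datetime) -> bool:
    # "Last 2 weeks" filter: compute the cutoff, then compare timestamps.
    return created >= now - timedelta(weeks=2)


# Display: apply the user's offset at render time.
# A fixed UTC-5 offset is assumed here and does not follow daylight saving.
user_tz = timezone(timedelta(hours=-5))
local = created_utc.astimezone(user_tz)

# Day name and formatted text are derived from the one stored value.
label = local.strftime("%A, %b %d, %Y at %I:%M%p")
```

Nothing besides the single UTC timestamp is stored; the weekday, the local time and the search cutoff are all computed when needed.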

Informix to Oracle: Dealing with Fetching Null Values

A bit of background first. My company is evaluating whether or not we will migrate our Informix database to Oracle 10g. We have several ESQL/C programs. I've run some through the Oracle Migration workbench and have been muddling through some testing. Now I've come to realize a few things.
First, we have dynamic SQL statements that are not handling null values at all. From what I've read, I either have to manually modify the queries to use the nvl( ) function or implement indicator variables. Can someone confirm whether manual modifications are necessary? The fewer manual changes we have to make to our converted ESQL/C programs, the better.
Second, we have several queries which pull dates from various tables etc., and in Informix dates are treated as type long, the # of days since Dec 31st, 1899.
In Pro*C, what format is a date being selected as? I know it's not numeric because I tried selecting date field into my long variable and get Oracle error stating "expected NUMBER but got a DATE". So I'm assuming we'd have to modify how we are selecting date fields - either select a date field in a converted manner so it becomes a long (ie, # of days since 12/31/1899), or change the host variable to match what Oracle is returning (what is that, string?).
Yes. You will need to modify your queries as you described.
long is tripping you up. long has a different meaning in Oracle. There is a specific DATE type. Generally, when selecting a date, one uses the TO_CHAR function with a format model to get the result as a VARCHAR2 in exactly the format you want.
Probably it didn't hit you yet but be aware that in Oracle empty VARCHAR2 fields are NULLs. I see no logic behind this (probably because I came from Informix land) - just keep it in mind. I think it is stupid - IMHO empty string is meaningful and different from NULL.
Either modify all your VARCHAR2 fields to be NOT NULL DEFAULT '-' or any other arbitrary value, or use indicators in ALL your queries that return VARCHAR2 fields, or always use NVL().
In order to convert the Oracle dates (which are stored in Oracle's internal format) into a long integer, you will need to alter your queries. Use the following formula for your dates:
to_number (to_char (date_column, 'J')) - to_number(to_char(to_date('12/31/1899', 'MM/DD/YYYY'), 'J'))
The Oracle 'J' (Julian day) format is a count of the number of days since January 1, 4712 BC. If you want to count from a later date, you need to subtract the Julian day count of that later date.
One suggestion: instead of altering all of the queries in your programs (which may create problems and introduce bugs), create a set of views in a different schema. These views would be named the same as the tables, with all the same columns, but would include the NVL() and date formulas (like above). Then point your application at the view schema rather than the base table schema. That means much less testing and fewer places to miss something.
So, for example, put all your tables into a schema called "APPS_BASE" (defined by the user "APPS_BASE"). Then create another schema/user called "APPS_VIEWS". In APPS_VIEWS create a view:
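The subtraction can be checked outside the database. The Python sketch below mirrors the formula's idea, counting days from December 31, 1899 as described in the question; the function names are made up for the example.

```python
from datetime import date, timedelta

# Day zero of the Informix-style day count described in the question.
INFORMIX_EPOCH = date(1899, 12, 31)


def to_informix_days(d: date) -> int:
    # Same idea as subtracting two Julian day numbers in the SQL formula.
    return (d - INFORMIX_EPOCH).days


def from_informix_days(n: int) -> date:
    # Inverse conversion, useful for spot-checking migrated values.
    return INFORMIX_EPOCH + timedelta(days=n)


sample = to_informix_days(date(2009, 1, 29))
```

Comparing a few values computed this way against what the converted queries return is a quick check that the epoch offset was applied in the right direction.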
CREATE OR REPLACE VIEW EMP AS
SELECT name, birth_date
FROM APPS_BASE.EMP;

Resources