Why won't " (800) 555-1212" fit in a varchar(20) field? - sql-server

I have a SQL Server phone number field that is defined as varchar(20). A phone number like '800-555-1212' fits with no issue but a phone number like '(800) 555-1212' will cause a "data will be truncated" error. Why would that be?

That example inserts fine into a field defined that way (I ran a test).
You may have some invisible characters adding to the length like tabs and carriage returns.
Or you may have a calculated field (or code in a trigger) that is based on that field (we have a phone_number_stripped field to store just the numerics) that has accounted for stripping out - but not ( or ).

A value could be "like" that value, but be padded with spaces on either end.
Make sure your software handles that case. I recommend allowing the space, but automatically trimming it off.

Related

SQL Server validating postcodes

I have a table containing postcodes but there is no validation built in to the entry form so there is no consistency in the way they are stored in the database, sample below:
ID Postcode
001742 B5
001745
001746
001748 DY3
001750
001751
001768 B276LL
001774 B339HY
001776 B339QY
001780 WR51DD
I want to use these postcode to map the distance from a central point but before I can do that I need to put them into a valid format and filter out any blanks or incomplete postcodes.
I had considered using
left(postcode,3) + ' ' + right(postcode,3)
To correct the formatting but this wouldn't work for postcodes like 'M6 8HD'
My aim is to get the list of postcodes in a valid format but I don't know how to account for different lengths of postcode. Is this there a way to do this in SQL Server?
As discussed in the comments, sometimes looking at a problem the other way around presents a far simpler solution.
You have a list of arbitrary input provided by users, which frequently doesn't contain the correct spacing. You also have a list of valid postcodes which are correctly spaced.
You're trying to solve the problem of finding the correct place to insert spaces into your arbitrary inputs to make them match the list of valid codes, and this is extremely difficult to do in practice.
However, performing the opposite task - removing the spaces from the valid postcodes - is remarkably easy to do. So that is what I'd suggest doing.
In our most recent round of data modelling, we have modelled addresses with two postcode columns - PostCode containing the postcode as provided from whatever sources, and PostCodeNoSpace, a computed column which strips whitespace characters from PostCode. We use the latter column for e.g. searches based on user input. You may want to do something similar with your list of Valid postcodes, if you're keeping it around permanently - so that you can perform easy matches/lookups and then translate those matches back into a version that has spaces - which is actually a solution to the original question posed!

Word wrap issues with SSIS Flat file destination

Background: I need to generate a text file with 5 records each of 1565 character length. This text file is further used to feed the data to a software.
Hence, they are some required fields and optional fields. I created a query with all the fields added together to get one single field. I populated optional fields with a blank.
For example:
Here is the sample input layout for each fields
Field CharLength Required
ID 7 Yes
Name 15 Yes
Address 15 No
DOB 10 Yes
Age 1 No
Information 200 No
IDNumber 13 Yes
and then i generated a query for each unique ID with the above fields into a single row which looks like following:
> SELECT Cast(1 AS CHAR(7))+CAST('XYZ' AS CHAR(15))+CAST('' AS CHAR(15))+CAST('22/12/2014' AS
CHAR(10))+CAST('' AS CHAR(1))+CAST(' AS CHAR(200))+CAST('123456' AS CHAR(13))
UNION
SELECT Cast(2 AS CHAR(7))+CAST('XYZ' AS CHAR(15))+CAST('' AS CHAR(15))+CAST('22/12/2014' AS
CHAR(10))+CAST('' AS CHAR(1))+CAST(''AS CHAR(200))+CAST('123456' AS CHAR(13))
Then, I created an SSIS package to produce the output text file through Flat file destination delimited.
Problem:
Even though the flat file is generated as per the desired length(1565). The text file looks differently when the word wrap is ON or OFF. When Word wrap is off , i get the record in single line. If the Word wrap is on, the line is broken into multiple. the length of the record in either case is same.
Even i tried to use VARCHAR + Space in the query instead of CHAR for each field, but there is no success. Its breaking the line for blank fields.
For example: Cast('' as varchar(1)) + Space(200-len(Cast('' as varchar(1)))) for Information field
Question: How do make it into a single line even though the word wrap is ON.
Since its my first post, please excuse me for format of the question
The purpose of word wrap is to put characters on the next line in instances of overflow rather than creating an extremely horizontal scrolling document.
Word wrap is the additional feature of most text editors, word processors, and web browsers, of breaking lines between words rather than within words, when possible.
Because this is what word wrap is there's nothing you can do to change its behavior. What does it matter anyway? The document should still be parsed as you would expect. Just don't turn word wrap on.
As far as I'm aware, having word wrap on or off has no impact on the document itself, it's simply a presentation option.
Applications parsing a document parse it as if word wrap were off. Something that could throw off parsing is breaks for a new line, but that is a completely different thing from word wrap.

Using text in a column of Date/Time type in access

I have a column in MS Access in which the data could be any of the following:
A date
Text string: "n/a"
Text string: "n/e"
The vast majority of entries will be dates but a very few will need to be these specified text strings. I would like to still be able to perform date calculations on the column. Whats the best datatype to use?
In my opinion the best approach would be to leave the date field as Date/Time and then add another field to indicate the status if the Date/Time field is Null. Something like:
DateField DateStatus
--------- ----------
2014-09-21
n/a
2014-09-23
2014-09-25
n/e
You could use a single Text field, but then any time you wanted to use the field value as a proper Date/Time value you'd have to convert it using CDate(). You would also have the possibility of other junk getting in there, or dates getting entered in different formats (e.g. d/m/yyyy vs. m/d/yyyy). And finally, you would lose the ability to easily determine whether a Date/Time value is in a particular row (which in my approach would simply be ... WHERE DateField IS [NOT] NULL).
I agree with Gord Thompson's answer - mainly because it's so non-intuitive to have, essentially, two completely different types of data in a single column, and because it's going to make validation/data integrity stuff so much harder with little upside - and, as he indicates with the CDate() reference, dates basically only work reliably like dates if they're in a "date/time" field. Microsoft has a page on choosing a data type that explains some of the Access-specific differences in more detail.
I also suggest that you don't actually have a text field for those "comments," since you say there's only a handful of potential options - use a Long Integer and connect back to a separate table with the list of allowable entries. This will allow you to run reports more easily, change the "display text" in one step instead of potentially dozens of times, etc. It also saves a relatively small amount of space per record (long integer = 4 bytes; text = up to 255 bytes.)
You can also do fun data/reporting stuff with that Comment (long integer) field and dates - even combined into ranges, by the way - queries let you use the two different columns to create a single answer. I have a report that's grouped so that you can see stats for everything that's active (by quarter in which they start) plus everything that's pending (with the code indicating who's responsible for watching this record,) plus everything that's not pending but still doesn't have a start date (with the reason code displayed,) plus everything that's expired (by quarter in which they ended.) It looks like each of those things is in a single column in the report, but it's actually like five columns that have been concatenated with the IIf function.
(Almost every argument I can come up with boils down to "this is what relational databases are all about and why they're so awesome.)

TClientDataset Widestring field doubles in size after reading NVARCHAR from database

I'm converting one of our Delphi 7 projects to Delphi X3 because we want to support Unicode. We're using MS SQL Server 2008/R2 as our database server. After changing some database fields from VARCHAR to NVARCHAR (and the fields in the accompanying ClientDatasets to ftWideString), random crashes started to occur. While debugging I noticed some unexpected behaviour by the TClientDataset/DbExpress:
For a NVARCHAR(10) databasecolumn I manually create a TWideStringField in a clientdataset and set the 'Size' property to 10. The 'DataSize' property of the field tells me 22 bytes are needed, which is expected since TWideStringField's encoding is UTF-16, so it needs two bytes per character and some space for storing the length. Now when I call 'CreateDataset' on the ClientDataset and write the dataset to XML (using .SaveToFile), in the XML file the field is defined as
<FIELD WIDTH="20" fieldtype="string.uni" attrname="TEST"/>
which looks ok to me.
Now, instead of calling .CreateDataset I call .Open on the TClientDataset so that it gets its data through the linked components ->TDatasetProvider->TSQLDataset (.CommandText = a simple select * from table)->TSQLConnection. When I inspect the properties of the field in my watch list, Size is still 10, Datasize is still 22. After saving to XML file however, the field is defined as
<FIELD WIDTH="40" fieldtype="string.uni" attrname="TEST"/>
..the width has doubled?
Finally, if I call .Open on the TClientDataset without creating any fielddefinitions in advance at all, the Size of the field will afterwards be 20(incorrect !) and Datasize 42. After saving to XML, the field is still defined as
<FIELD WIDTH="40" fieldtype="string.uni" attrname="TEST"/>
Does anyone have any idea what is going wrong here?
Check the fieldtype and it's size at the SQLCommand component (which is before DatasetProvider).
Size doubling may be a result of two implicit "conversions": first - server provides NVarchar data which is stored into ansi-string field (and every byte becomes a separate character), second - it is stored into clientdataset's field of type Widestring and each character becomes 2 bytes (size doubles).
Note that in prior versions of Delphi string field size mismatch between ClientDataset's field and corresponding Query/Command field did not result in an exception but starting from one of XE*'s it offten results in AV. So you have to check carefully string field sizes during migration.
Sounds like because of the column datatype being changed, it has created unexpected issues for you. My suggestion is to
1. back up the table,multiple ways to doing this,pick your poison figuratively speaking
2. delete the table,
3. recreate the table,
4. import the data from the old table to the newly created table. See if that helps.
Sql tables DO NOT like it when column datatypes get changed, and unexpected issues may arise from doing just that. So try that, and worst case scenario, you have wasted maybe ten minutes of your time trying a possible solution.

Twitter name length in DB

I'm adding a field to a member table for twitter names for members on a site. From what I can work out the maximum twitter name length is 20 so it seems obvious that I should set the field size to varchar(20) (SQL Server).
Is this a good idea?
What if Twitter starts allowing multi-byte characters in the user names? Should I make this field nvarchar?
What if Twitter decides to increase the size of a username? Should I make it 50 instead and then warn a user if they enter a name longer than 20?
I'm trying to code defensively so that I can reduce the chances of modifying the code around this input field and the DB schema changes that might be needed.
while looking for the same info i found the following in a sort of weird place in the twitter help section (why not in the API docs? who knows?):
"Your user name can contain up to 15 characters. Why no more? Because we append your user name to your 140 characters on outgoing SMS updates and IM messages. If your name is longer than 15 characters, your message would be too long to send in a single text message."
http://help.twitter.com/entries/14609-how-to-change-your-username
so perhaps one could even get away with varchar(16)
While new accounts has a limit of 15 characters in the username and 20 characters in the name, for old accounts this limit seems to be undefined. The documentation here states:
Earlybirds: Early users of Twitter may have a username or real name longer than user names we currently allow. This is ok until you need to save changes to your account settings. No changes will save unless your user/real name is the appropriate length; this means you have to change your real name/username to meet our most modern regulations.
So you are probably better of having a long field and save yourself some time when you hit the border cases.
Nowadays, space is usually not a concern, so I'd use a mostly generic approach: use nvarchar(200).
When designing DB schemas you must think 2 steps ahead, even more than when programming. Or get yourself a good schema update strategy, then you'll be fine also with varchar(20).
Personally I wouldn't worry. Use something like 200 (or a nice round number like 256) and you won't have this problem. The limit then is on their API, so you might be best to do some verification that it is a real username anyway. That verification implicitly includes the length checking.
Twitter allows for 140 characters to be typed in as the message payload for transmission, and includes "[username]:" at the beginning of the SMS message. With an upper limit of 140 characters for the message combined with the messaging system being based on SMS, I think they would have to decrease the allowable message size to increase the username. I think it is a pretty safe bet that 20 characters would be the max username length. I'd use nvarchar just in case someone uses 16-bit characters, and maybe pad it a little. nvarchar(24) should work; I wouldn't go any higher than nvarchar(32).
If you're going to develop an app for their service, you should probably watch the messages on Twitter's API Announcements mailing list.
[opinion only]
Twitter works on SMS and the limit there is something like 256 characters, so the name has to be small to avoid hitting into the message.
nvarchar would be a good idea for all twitter text
If the real ID of a Twitterer is a cell-phone then the longest phone number is your max - 20 should easily cover it!
Defensive programming is always good :) !
[/opinion only]
There's only so much you can code defensively, I'd suggest looking at the twitter API documentation and following anything specified there. That said, from a cursory look through nowhere seems to specify the length of the username, annoyingly :/
One thing to keep in mind here is that a field using nvarchar needs twice as much space, since it needs 2 bytes to store each potential unicode character. So, a twitter status would need a size of 280 using nvarchar, PLUS some more for possible retweets, as those aren't inlcuded in the 140 char limit. I discovered this just today in fact!
For example:
RT #chatrbyte: here's some great tweet
that I'm retweeting.
The RT #chatrbyte: is not included in the 140 character limit.
So, assuming that a Twitter username has a 20 character limit, and wanting to also capture a ReTweet, a field to hold a full tweet would need to be a nvarchar of size 280 + 40 (for the username) + 8 (for the initial RT # before a retweet) +4 (for the :+space after a Retweet username) = 330.
I would say go for nvarchar(350) to give yourself a little room. That's what I am trying right now. If I'm wrong I'll update here.
I'm guessing you are managing the data entry on the Twitter name field in your application somewhere other than just in the database. If you open the field to 200 characters, you only have to change the code in one place or if you allow users to enter Twitters names with more than 20 characters, you don't have to worry about a change at all.

Resources