How many address fields would you use for a UK database?

Address records are probably used in most databases, but I've seen a number of slightly different sets of fields used to store them. The number of fields seems to vary from 3 to 7, and sometimes all fields are simply labelled address1..addressN, while other times they are given specific meanings (town, city, etc).
This is UK specific, though I'm open to comments about the rest of the world too. Here you need the first line of the address (actually just the number) and the post code to identify the address - everything else is mostly an added bonus.
I'm currently favouring:
Address 1
Address 2
Address 3
Town
County
Post Code
We could add Country if we ever needed it (unlikely).
What do you think? Is this too little, too much?

The Post Office suggests (http://www.postoffice.co.uk/portal/po/content1?catId=19100182&mediaId=19100267) 7 lines:
Addressee's Name
Company/Organisation
Building Name
Number of building and name of thoroughfare
Locality Name
Post Town
Post Code
They then say you do not need to include a County name provided the Post Town and Postcode are used.
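As a rough illustration of the seven-line layout above, here is a minimal Python sketch that stores each line in its own field and prints only the lines that are filled in. The class name, field names and the sample addressee are hypothetical choices for this example, not anything the Post Office prescribes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PostalAddress:
    # One field per line of the seven-line layout; all are optional here.
    addressee: str = ""
    organisation: str = ""
    building_name: str = ""
    number_and_thoroughfare: str = ""
    locality: str = ""
    post_town: str = ""
    postcode: str = ""

    def label_lines(self) -> List[str]:
        """Return the non-empty lines in printing order."""
        parts = [
            self.addressee,
            self.organisation,
            self.building_name,
            self.number_and_thoroughfare,
            self.locality,
            self.post_town,
            self.postcode,
        ]
        return [p.strip() for p in parts if p.strip()]


address = PostalAddress(
    addressee="A N Other",  # hypothetical addressee
    number_and_thoroughfare="1 High Street",
    post_town="Douglas",
    postcode="IM99 0XX",
)
print("\n".join(address.label_lines()))
```

Keeping empty fields out of the printed label means the same record shape works for both short residential addresses and longer business ones.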

The BSI have BS 7666 - that covers all addressing. I recommend you look there.
The 2000 version recommends
An address shall be based upon a logical data model comprising the following entities:
addressable object, with sub-types:
primary addressable object;
secondary addressable object;
street;
locality;
town;
administrative area, a.k.a. district;
county;
postcode.
See: http://landregistry.data.gov.uk/def/common/BS7666Address

I don't know whether this is minimal (I doubt it) but the heading on my cheque book says something pretty close to:
Lloyds TSB
Isle of Man Offshore Centre
Peveril Buildings
Peveril Square
Douglas
Isle of Man
IM99 0XX
United Kingdom
This causes fits when I try to enter it into the US banking system.

If I were you, I'd call Royal Mail and ask them... or look on their website for postcode lookup as a best practice.
There are different types of addresses, and each type has a slightly different structure. Forward sorting offices have a different postal address structure than a residential home with a street number. What if the house has a name instead of a number? There are so many factors to consider.
Since I moved to Canada I had to do something similar and it's far more complicated than a straightforward residential address which generally has:
Street Number if applicable
Street Number Suffix if applicable
House Name
Street Name
Street Type
Street Direction if applicable
Unit Number for flats, townhouses or other types of building/location
Minor Municipality (Village)
Major Municipality (Major Town/City)
County
PostCode
Country if you include Scotland, Wales, Northern Ireland (and now I noticed Eire)
Then you get businesses that have their own Delivery Route, PO Boxes, Forward Sortation Offices...
It gets complicated in a real hurry.
Best bet - give Royal Mail a call and they should be able to give you information on their standard address templates.
EDIT: Your 3 field method isn't a bad one...particularly. However, data sanitization may be a significant issue using the field setup you have and you may need a fairly complex strategy for making sure that the address entered is valid. It's far easier to sanitize single dedicated fields to make sure input is correct than it is to parse various address tokens out of combined fields.
Another simpler way to gain this info is to go on the Royal Mail website and check their postcode lookup page.
On their main postcode lookup, they use 4 fields and I guess they have some form of validation on the street name/type field. They separate the house number and name and I guess they only allow major municipality. I'm assuming the county/country are assumed. If you break out their advanced search, they give you two extra fields for flat number and business name.
Given that some fields are combined on their site, you have to assume that there's some amount of validation to make sure that data entered can be gainfully used.
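To illustrate why a dedicated field is easier to sanitise than a combined one, here is a minimal Python sketch that normalises and checks a postcode field on its own. The pattern is a deliberately simplified assumption (it is not Royal Mail's official rule set and will accept some strings that are not real postcodes), and normalise_postcode is a hypothetical helper name.

```python
import re
from typing import Optional

# Simplified shape check: 1-2 letters, a digit, an optional letter or digit,
# a space, then a digit and two letters. An assumption, not an official rule.
POSTCODE_RE = re.compile(r"[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][A-Z]{2}")


def normalise_postcode(raw: str) -> Optional[str]:
    """Upper-case the input, fix the spacing, and return None if the shape is wrong."""
    compact = re.sub(r"\s+", "", raw).upper()
    if len(compact) < 5:
        return None
    # The last three characters always form the second half of the postcode.
    candidate = compact[:-3] + " " + compact[-3:]
    return candidate if POSTCODE_RE.fullmatch(candidate) else None


print(normalise_postcode("im990xx"))
print(normalise_postcode("12345"))
```

The same check applied to a free-text "address3" field would first have to work out which token, if any, is meant to be the postcode.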

Premises elements
Sub Building Name
Building Name
Building Number
Organisation Name
Department Name
PO Box Number
Thoroughfare elements
Dependent Thoroughfare Name
Dependent Thoroughfare Descriptor
Thoroughfare Name
Thoroughfare Descriptor
Locality elements
Double Dependent Locality
Dependent Locality
Post Town
Postcode element
Postcode
This answer may be a few years late, but it's aimed at those like myself looking for guidance on how to correctly format postal addresses for both storing in a database (or the likes of it) and for printing purposes.
Taken from Royal Mail Doc, link below - conveniently titled the 'Programmers Guide'
Technical specification for users of PAF
Pages 27-42 were most helpful for me.

It's very likely that a "UK" database will be opened up to Eire as well, and in some lines of business there will be legal differences, generally between Scotland / NI / the Channel Islands and England and Wales.
In short, I would add country to the list. Otherwise it's fine (certainly no fewer), though of course any address is traceable from a building reference, a post code and a country alone.

Where we live in France it's just 3 lines:
myname
village/location name
5 digit postcode followed by post town name in uppercase
Even from the UK that's all that is required.

Related

React library for physical address localization

My team wants to implement a feature for physical addresses to be localized from navigator.language, for example, in the US this is a physical address:
919 Stimple Ct, Fairbanks, Alaska 99712, USA
in other countries like Italy, the address is:
Rossi Gianni, VIA GARIBALDI 26, 70043 MONOPOLI BA, ITALIA
and each country or region has a different format for physical addresses.
The thing is, our address display is based on the US physical address format: the street number and name, the state and the city, which are then displayed on another screen. Other countries are different, so the street number and name appear however the user writes them, but the state, city and postal code fields are still shown in the US address format. Changing the input fields has the same problem, and changing the variables for each input means writing code for each language (not an option).
I've been searching for a React library to format physical addresses, but all I've found are libraries that parse an address into an object; they don't format the address by location.
So I'm looking for someone who may know a library (or maybe a paid license) to format physical addresses by location.
We may use only 5 address formats, one per continent, if there is no library to localize them.
Thanks !

Multiple language (but same value) count in Laravel

My Goal: To find out which University has the largest number of users (DISTINCT and COUNT in MySQLi).
I've been developing a survey website for Portugal, England and France.
In the survey, some questions have predefined answer options.
For example: Gender, Living Country, Graduation Level (undergraduate, graduate, PhD, BBA etc)
But I also have questions where users need to write down the answers.
For example, University Name (where the user studied).
Two users filled in the form as follows:
In this case the text "University of Glasgow" in English and the text "Universidade of Glasgow" in Portuguese are different, but they refer to the same institution.
So each of these two entries is counted as having one user, when in truth the University has two users (both entries are the same University).
My Question: How can I get the expected result?
I was planning to use Google Translate, but it won't be accurate.
I also thought about storing every University name in 3 languages, but there are thousands of Universities, so it may not be efficient.
The structure I thought of for the tables is:
survey_table
id, que_en, que_fr, que_pt, university_name
statistics_table
id, university_name, count
You could use localization for the university name. Check the docs here:
https://laravel.com/docs/5.7/localization
Make your users choose from a drop down list based on their locale (language)
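Whichever way the names are presented, the counting becomes straightforward once every localized spelling maps to one canonical key. A minimal sketch of that idea in Python follows (not Laravel code); the alias table, the French spelling in it and the function name are hypothetical and only there to show the shape of the approach.

```python
from collections import Counter

# Hypothetical alias table: every localized spelling points at one canonical key.
UNIVERSITY_ALIASES = {
    "university of glasgow": "glasgow",
    "universidade of glasgow": "glasgow",
    "université de glasgow": "glasgow",
}


def count_by_university(answers):
    """Count answers per canonical university; unmapped spellings go to 'unknown'."""
    counts = Counter()
    for answer in answers:
        key = UNIVERSITY_ALIASES.get(answer.strip().lower(), "unknown")
        counts[key] += 1
    return counts


counts = count_by_university(["University of Glasgow", "Universidade of Glasgow"])
print(counts["glasgow"])
```

With a drop-down list the form can submit the canonical key directly, so the alias lookup is only needed for free-text answers that were already collected.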

Get the region name in maxmind database

I have downloaded a cities database
Country  City        AccentCity  Region  Population  Latitude  Longitude
af       amir kalay  Amir Kalay  16      0           34.6333   70.3333
ad       aixas       Aixas       06      0           42.4833   1.4667
and a lot more records
I have downloaded another database called fips_10_4 to show the state of the city
country,Region,State
AD,02,"Canillo"
AD,03,"Encamp"
AD,04,"La Massana"
AD,05,"Ordino"
AD,06,"Sant Julia de Loria"
AD,07,"Andorra la Vella"
AD,08,"Escaldes-Engordany"
AE,01,"Abu Dhabi"
Now, if you are thinking that I am asking for some SQL query, then you are wrong.
Everything was working fine, but then I came to know that the file I downloaded from the MaxMind website is incomplete, as 'fips_10_4' has no record for country 'af' and region '16'. Can anybody help me deal with this problem and tell me the correct place to download the complete file?
FIPS 10-4 has changed. The list of changes can be found here.
In particular, AF16 (Laghman) has changed to AF35. MaxMind uses the new list.
If you need both the old and the new codes, you can find them here. You can parse the contents of the file, and replace your database table with the information found there.
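A minimal Python sketch of that lookup, assuming the region file keeps the country,Region,State layout shown in the question. The two sample rows and the single old-to-new entry come from this thread (AD06, and AF16 becoming AF35 for Laghman); the function names are hypothetical, and a real table of changes would need every renamed code, not just this one.

```python
import csv
import io

# Sample of the region file in the layout from the question.
FIPS_CSV = """country,Region,State
AD,06,"Sant Julia de Loria"
AF,35,"Laghman"
"""

# Old code -> new code. Only the one change named above is listed here.
RENAMED = {("AF", "16"): ("AF", "35")}


def load_regions(text):
    """Build a {(country, region): name} dictionary from the CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {(row["country"].upper(), row["Region"]): row["State"] for row in reader}


def region_name(regions, country, region):
    """Look up a region name, translating withdrawn codes to their replacements first."""
    key = (country.upper(), region)
    key = RENAMED.get(key, key)
    return regions.get(key)


regions = load_regions(FIPS_CSV)
print(region_name(regions, "af", "16"))
```

Region codes are kept as strings so that leading zeros such as "06" survive the round trip.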
AF is the two-letter ISO country code (ISO 3166-1 alpha-2) for Afghanistan; ISO are currently trying to sell the standard for the frankly astonishing sum of CHF 244 (Swiss Francs).
As Teleo says FIPS 10-4 has changed as detailed on the ITL website and the link Teleo has given provides the data in a more usable format. MaxMind also provides it in a better format.
I would be extremely wary about using this. Both MaxMind & Teleo's link are being provided, for free, by an external company/person that has no particular interest in keeping their data up-to-date. I notice, for instance, that the following countries are missing:
South Sudan
Sint Maarten (Dutch Part)
Bonaire, Sint Eustatius and Saba
Curaçao
The last three were part of the Netherlands Antilles, which was dissolved on 10th October 2010. Incidentally, the Netherlands Antilles, which hasn't existed for a year and a half, is still on this file.
The reason for all of this? FIPS 10-4 was withdrawn almost a decade ago on 8th September 2002. To quote the ITL website (my emphasis):
“For a replacement to FIPS 10-4, INCITS L1 is coordinating with other
standards developers and interested parties to determine whether
processing a draft proposed American National Standard or adopting an
ISO standard would be the better way forward. For more information on
the status of this activity, contact Rick Pearsall
(Richard.A.Pearsall#nga.mil).”
A quick Google brings the news that INCITS L1 is next meeting on the 12th June 2012. I wouldn't hold your breath.
Another reason not to use FIPS is that it is unlikely to be used much outside of the USA (obviously some people will still use it). While this may not matter immediately I would future proof your systems as a matter of course.
I would highly recommend using the ISO 3166 standard. It is a globally recognised way of categorising country data.
The CommonDataHub maintains a great version, which includes country and state in the same manner as FIPS 10-4. They also have other ISO states databases, which are more normalised and worth investigating.
It also maintains a list of all cities with a population greater than 5,000.
ISO maintain a copy of the 3166-2 standard on their website. It will take a bit of coding to ensure you're always up to date, but at least you'll be sure it's correct. Wikipedia is also surprisingly good at keeping up-to-date. It beat CommonDataHub by a month when South Sudan was created, due to problems telling people that the data existed.
There are other places out there where this data exists, this just details what I use.
If you want to avoid databases altogether then the Yahoo! PlaceFinder API is a good place to start. It has some documented problems keeping up-to-date but at least there's a place where you can tell them they've got it wrong.
tl;dr
Don't use FIPS, it was withdrawn a decade ago. Use the globally recognised ISO standard instead.
I am not sure what your true goal is, but here is a great resource of countries and cities and all...

What is the most effective way to handle lots of tables in a database?

I am new to database programming and am using sqlite and python. As an example, let's say I have a database named Animals.db, which I open and get a cursor for in python. Now if I wanted to separate the animals by species I would have a different table per species, and since it can get even more specific I would likely need something more specific than just a table of species.
I am a bit confused about how one allocates the correct data to the correct area of a database. How is it separated? Are there tables of tables?
If I wanted to, let's say, have a table for every land animal and another for every animal of the sea, but each table would need further specification (homo sapiens, etc), how can I do that?
"Now if I wanted to separate the animals by species I would have a different table per species"
Maybe. Maybe not. You might use a table that looked like this. It depends entirely on what you mean by "separate the animals by species". Here's one reasonable interpretation.
Animal_name  Sex  Species
-------------------------------------
Jack         M    Leopardus pardalis
Susie        F    Leopardus pardalis
Kimmie       M    Leopardus pardalis
Susie        F    Stenella clymene
Ginger       F    Stenella clymene
Mary Ann     F    Stenella clymene
To find all the Clymene dolphins, you might use a query along these lines.
select Animal_name
from animals
where species = 'Stenella clymene'
order by Animal_name
Animal_name
--
Ginger
Mary Ann
Susie
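Since the question mentions sqlite and python, here is a minimal sketch of the same single-table idea using Python's sqlite3 module. It uses an in-memory database rather than Animals.db, and the column types and the helper name names_for_species are assumptions made for this example.

```python
import sqlite3

# In-memory database for the example; a real script would connect to Animals.db.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE animals ("
    "animal_name TEXT NOT NULL, sex TEXT NOT NULL, species TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("Jack", "M", "Leopardus pardalis"),
        ("Susie", "F", "Leopardus pardalis"),
        ("Kimmie", "M", "Leopardus pardalis"),
        ("Susie", "F", "Stenella clymene"),
        ("Ginger", "F", "Stenella clymene"),
        ("Mary Ann", "F", "Stenella clymene"),
    ],
)


def names_for_species(connection, species):
    """Return the sorted animal names for one species from the shared table."""
    cursor = connection.execute(
        "SELECT animal_name FROM animals WHERE species = ? ORDER BY animal_name",
        (species,),
    )
    return [row[0] for row in cursor]


print(names_for_species(conn, "Stenella clymene"))
```

One table with a species column covers every species; adding a new species is just a matter of inserting rows, with no new table needed.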
Start by collecting data. Your goal is to collect a set of representative sample data. Sample data, because the full population is too big to handle. Representative, because ideally it represents all the problems you're likely to run into with the full population. If "animal name" to you doesn't mean "Jack" or "Ginger", but "ocelot" and "Clymene dolphin", representative sample data will make that clear.

List of standard lengths for database fields

I'm designing a database table and asking myself this question: How long should the firstname field be?
Does anyone have a list of reasonable lengths for the most common fields, such as first name, last name, and email address?
I just queried my database with millions of customers in the USA.
The maximum first name length was 46. I go with 50. (Of course, only 500 of those were over 25, and they were all cases where data imports resulted in extra junk winding up in that field.)
Last name was similar to first name.
Email addresses maxed out at 62 characters. Most of the longer ones were actually lists of email addresses separated by semicolons.
Street address maxes out at 95 characters. The long ones were all valid.
Max city length was 35.
This should be a decent statistical spread for people in the US. If you have localization to consider, the numbers could vary significantly.
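If you want to run the same kind of check against your own data, a minimal sketch with Python's sqlite3 module might look like the following. The customers table, its columns and the two sample rows are hypothetical; the query itself only relies on the standard MAX and LENGTH functions.

```python
import sqlite3


def max_lengths(connection, table, columns):
    """Return the longest stored value length for each column.

    Table and column names cannot be bound as parameters, so only pass
    trusted identifiers here.
    """
    select = ", ".join(f"MAX(LENGTH({column}))" for column in columns)
    row = connection.execute(f"SELECT {select} FROM {table}").fetchone()
    return dict(zip(columns, row))


# Hypothetical sample data standing in for a real customer table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (first_name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Al", "Fairbanks"), ("Krishnamurthy", "Douglas")],
)

print(max_lengths(conn, "customers", ["first_name", "city"]))
```

As the answer notes, the raw maximum is often junk from bad imports, so it is worth looking at the longest few values before sizing a column from it.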
UK Government Data Standards Catalogue details the UK standards for this kind of thing.
It suggests 35 characters for each of Given Name and Family Name, or 70 characters for a single field to hold the Full Name, and 255 characters for an email address, amongst other things.
W3C's recommendation:
If designing a form or database that will accept names from people
with a variety of backgrounds, you should ask yourself whether you
really need to have separate fields for given name and family name.
… Bear in mind that names in some cultures can be quite a lot longer
than your own. … Avoid limiting the field size for names in your
database. In particular, do not assume that a four-character
Japanese name in UTF-8 will fit in four bytes – you are likely to
actually need 12.
https://www.w3.org/International/questions/qa-personal-names
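The byte-count point in that quote is easy to check. A minimal Python sketch, using a common placeholder name as the example (the helper name utf8_length is made up for this illustration):

```python
def utf8_length(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


name = "山田太郎"  # a four-character placeholder name
print(len(name), utf8_length(name))
```

This matters when a column length is defined in bytes rather than characters, which depends on the database and its configuration.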
For database fields, VARCHAR(255) is a safe default choice, unless you can actually come up with a good reason to use something else. For typical web applications, performance won't be a problem. Don't prematurely optimize.
Some almost-certainly correct column lengths
                            Min   Max
Hostname                    1     255
Domain Name                 4     253
Email Address               7     254
Email Address [1]           3     254
Telephone Number            10    15
Telephone Number [2]        3     26
HTTP(S) URL w domain name   11    2083
URL [3]                     6     2083
Postal Code [4]             2     11
IP Address (incl ipv6)      7     45
Longitude                   numeric 9,6
Latitude                    numeric 8,6
Money [5]                   numeric 19,4
[1] Allow local domains or TLD-only domains
[2] Allow short numbers like 911 and extensions like 16045551212x12345
[3] Allow local domains, tv:// scheme
[4] http://en.wikipedia.org/wiki/List_of_postal_codes. Use max 12 if storing dash or space
[5] http://stackoverflow.com/questions/224462/storing-money-in-a-decimal-column-what-precision-and-scale
A long rant on personal names
A personal name is either a Polynym (a name with multiple sortable components), a Mononym (a name with only one component), or a Pictonym (a name represented by a picture - this exists due to people like Prince).
A person can have multiple names, playing roles, such as LEGAL, MARITAL, MAIDEN, PREFERRED, SOBRIQUET, PSEUDONYM, etc. You might have business rules, such as "a person can only have one legal name at a time, but multiple pseudonyms at a time".
Some examples:
names: [
    {
        type:"POLYNYM",
        role:"LEGAL",
        given:"George",
        middle:"Herman",
        moniker:"Babe",
        surname:"Ruth",
        generation:"JUNIOR"
    },
    {
        type:"MONONYM",
        role:"SOBRIQUET",
        mononym:"The Bambino" /* mononyms can be more than one word, but only one component */
    },
    {
        type:"MONONYM",
        role:"SOBRIQUET",
        mononym:"The Sultan of Swat"
    }
]
or
names: [
    {
        type:"POLYNYM",
        role:"PREFERRED",
        given:"Malcolm",
        surname:"X"
    },
    {
        type:"POLYNYM",
        role:"BIRTH",
        given:"Malcolm",
        surname:"Little"
    },
    {
        type:"POLYNYM",
        role:"LEGAL",
        given:"Malik",
        surname:"El-Shabazz"
    }
]
or
names: [
    {
        type:"POLYNYM",
        role:"LEGAL",
        given:"Prince",
        middle:"Rogers",
        surname:"Nelson"
    },
    {
        type:"MONONYM",
        role:"SOBRIQUET",
        mononym:"Prince"
    },
    {
        type:"PICTONYM",
        role:"LEGAL",
        url:"http://upload.wikimedia.org/wikipedia/en/thumb/a/af/Prince_logo.svg/130px-Prince_logo.svg.png"
    }
]
or
names: [
    {
        type:"POLYNYM",
        role:"LEGAL",
        given:"Juan Pablo",
        surname:"Fernández de Calderón",
        secondarySurname:"García-Iglesias" /* Hispanic people often have two surnames. It can be impolite to use the wrong one. Portuguese and Spaniards differ as to which surname is important */
    }
]
Given names, middle names and surnames can be multiple words, such as "Billy Bob" Thornton, or Ralph "Vaughan Williams".
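A business rule such as "only one legal name at a time" can be checked directly against a structure like the ones above. Here is a minimal Python sketch using plain dictionaries; the function name and the shape of the limits argument are assumptions for this example, and the sample data is the first example from this answer.

```python
from collections import Counter


def roles_over_limit(names, limits):
    """Return the roles that appear more often than the given per-role limit."""
    counts = Counter(name["role"] for name in names)
    return sorted(role for role, limit in limits.items() if counts[role] > limit)


babe_ruth = [
    {"type": "POLYNYM", "role": "LEGAL", "given": "George", "middle": "Herman",
     "moniker": "Babe", "surname": "Ruth", "generation": "JUNIOR"},
    {"type": "MONONYM", "role": "SOBRIQUET", "mononym": "The Bambino"},
    {"type": "MONONYM", "role": "SOBRIQUET", "mononym": "The Sultan of Swat"},
]

# One legal name is allowed; roles without a limit are unrestricted.
print(roles_over_limit(babe_ruth, {"LEGAL": 1}))
```

Roles that are left out of the limits dictionary, such as pseudonyms or sobriquets, can repeat freely.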
I would say to err on the high side. Since you'll probably be using varchar, any extra space you allow won't actually use up any extra space unless somebody needs it. I would say for names (first or last), go at least 50 chars, and for email address, make it at least 128. There are some really long email addresses out there.
Another thing I like to do is go to Lipsum.com and ask it to generate some text. That way you can get a good idea of just what 100 bytes looks like.
I pretty much always use a power of 2 unless there is a good reason not to, such as a customer facing interface where some other number has special meaning to the customer.
If you stick to powers of 2 it keeps you within a limited set of common sizes, which itself is a good thing, and it makes it easier to guess the size of unknown objects you may encounter. I see a fair number of other people doing this, and there is something aesthetically pleasing about it. It generally gives me a good feeling when I see this, it means the designer was thinking like an engineer or mathematician. Though I'd probably be concerned if only prime numbers were used. :)
These might be useful to someone:
youtube max channel length = 20
facebook max name length = 50
twitter max handle length = 15
email max length = 255
http://www.interoadvisory.com/2015/08/6-areas-inside-of-linkedin-with-character-limits/
I wanted to find the same and the UK Government Data Standards mentioned in the accepted answer sounded ideal. However none of these seemed to exist any more - after an extended search I found it in an archive here: http://webarchive.nationalarchives.gov.uk/+/http://www.cabinetoffice.gov.uk/govtalk/schemasstandards/e-gif/datastandards.aspx. Need to download the zip, extract it and then open default.htm in the html folder.
+------------+---------------+---------------------------------+
| Field | Length (Char) | Description |
+------------+---------------+---------------------------------+
|firstname | 35 | |
|lastname | 35 | |
|email | 255 | |
|url | 60+ | According to server and browser |
|city | 45 | |
|address | 90 | |
+------------+---------------+---------------------------------+
Edit: Added some spacing
Just looking through my email archives, there are a number of pretty long "first" names (of course what is meant by first is variable by culture). One example is Krishnamurthy - which is 13 letters long. A good guess might be 20 to 25 letters based on this. Email should be much longer since you might have firstname.lastname@somedomain.com. Also, gmail and some other mail programs allow you to use firstname.lastname+sometag@somedomain.com where "sometag" is anything you want to put there so that you can use it to sort incoming emails. I frequently run into web forms that don't allow me to put in my full email address because they didn't consider tags. So, if you need a fixed email field, maybe something like 25.25+15@20.3 in characters, for a total of 90 characters (if I did my math right!).
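To double-check that arithmetic, here is a small Python sketch that adds up the layout described above (first name, dot, last name, plus sign, tag, at sign, domain, dot, top-level domain). Counting each separator as one character, the total comes out slightly above 90, so rounding the field up would be the safer choice.

```python
# Component widths taken from the layout described above.
parts = {
    "first name": 25,
    "last name": 25,
    "tag": 15,
    "domain": 20,
    "top-level domain": 3,
}
# Separators: the dot between the names, the plus sign, the at sign,
# and the dot before the top-level domain.
separator_count = 4

letters = sum(parts.values())
total = letters + separator_count
print(letters, total)
```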
I usually go with:
Firstname: 30 chars
Lastname: 30 chars
Email: 50 chars
Address: 200 chars
If I am concerned about long fields for the names, I might sometimes go with 50 for the name fields too, since storage space is rarely an issue these days.
If you need to consider localisation (for those of us outside the US!) and it's possible in your environment, I'd suggest:
Define data types for each component of the name (NOTE: some cultures have more than two names!), then have a type for the full name.
Then localisation becomes simple (as far as names are concerned).
The same applies to addresses, BTW - different formats!
It is varchar, right? So it doesn't matter whether you use 50 or 25; better to be safe and use 50. That said, I believe the longest I have seen is about 19 or so. Last names are longer.

Resources