Ordering by the first alphabetical char in a column in MYSQL - database

A table in a MYSQL database has address details- eg...
add1, add2, add3, district, postalTown, country
Ordering by postal town is usually fine, but some details have numbers in the postalTown column. For example 1420 Territet or 3100 Overijse. This will mean these will appear at the top above Aberdeen or Bristol. Is there a way of ordering by postalTown but by the first alphabetical character? That would mean the order of the above would be- Aberdeen, Bristol, Overijse, Territet
Thanks

Write an expression that will return the first alphabetical character, then just Order By [that expression]
Order By substring(LTrim(
Replace(Replace(Replace(Replace(Replace(
Replace(Replace(Replace(Replace(Replace(
colname, '1', ''),'2',''),'3',''),'4,''),'5', ''),
'6',''),'7',''),'8',''),'9',''),'0',''))
1,1)
If you want the rows sorted by the entire city name, and not just by the first character (as question title specifies) then use this:
Order By LTrim(
Replace(Replace(Replace(Replace(Replace(
Replace(Replace(Replace(Replace(Replace(
colname, '1', ''),'2',''),'3',''),'4,''),'5', ''),
'6',''),'7',''),'8',''),'9',''),'0',''))
Above is a guess (I haven't tried it), but the idea is first delete all numeric characters from the column value, then take the first character of whatever is remaining.
Also, if this works, and if you have any development access to the dataabse, (thinking DRY principle), I would add a computed column to this table, (or a separate view against the table), that is defined to use the above expression, so that this "extraction" of the town name is available to all other code that might want to access it without copying this expression everywhere you may need it..

You could write a stored function which returns the remainder of the column starting at the first alphabetic character (perhaps using REGEXP to find that index). Then order by the stored function.
Edit: instead of regexp in your function, depending on data format you could do a 'substring_index' on ' ' (space) and return the index of the first space, then call substring to return the remainder of the string after the first space.
Once you've created a stored function to return the string following the numbers, you can utilize it like this:
order by yourfunctionname(postalTown)
Stored Functions

First thing that comes to mind to me would to do the following on my ORDER BY, obviously, adding numbers 0 through 9. You'll notice crappy schemas produce crappy solutions. :) As the gentleman said above, you should probably think about a redesign of how you are storing your town data.
ORDER BY REPLACE(REPLACE(REPLACE(FieldName, '1', ''),'2',''),'3','') ETC.

Create a view on the table, making whatever translations you need, and then query against the view?

Related

Full text index doesn't work at single word?

I have a full text index on many columns of the customer table, one of which columns is fname.
The following query:
select * from customer where fname like 'In%' and code='1409584557891'
returns me the line needed, this customer has an fname of 'In' .But if I add this to the end:
and contains((customer.fname) , N'"In*"')
an empty result-set is retuned. Why?
Also: there is another column named lname. If I add the equivelant contains command with the column and its value altered, it works!
There is a good chance "In" is a noise word. I also believe that if you do a fulltextsearch for something too short like the letter 'a' it is simply considered a noise word. See if 'a' or 'I' gives you anything.
Here is a link that can provide information on changing the noise words around if that is the case.
https://www.mssqltips.com/sqlservertip/1491/sql-server-full-text-search-noise-words-and-thesaurus-configurations/
You may also be able to simply turn off noise or 'stop' words:
https://dba.stackexchange.com/questions/135062/sql-server-no-search-results-caused-by-noise-words

How to take apart information between hyphens in SQL Server

How would I take apart a column that contains string:
92873-987dsfkj80-2002-04-11
20392-208kj48384-2008-01-04
Data would look like this:
Filename Yes/No Key
Abidabo Yes 92873-987dsfkj80-2002-04-11
Bibiboo No 20392-208kj48384-2008-01-04
Want it to look like this:
Filename Yes/No Key
Abidabo Yes 92873-987dsfkj80-20020411
Bibiboo No 20392-208kj48384-20080104
whereby I would like to concat the dates in the end as 20020411 and 20080104. From the right side, the information is the same always. From the left it is not, otherwise I could have concatenated it. It is not an import issue.
As mentioned in the comments already, storing data like this is a bad idea. However, you can obtain the dates from those strings by using a RIGHT function like so:
SELECT RIGHT('20392-208kj48384-2008-01-04', 10)
Output:
2008-01-04
Depending on the SQLSERVER version you are using, you can use STRING_SPLIT which requieres COMPATIBILITY_LEVEL 130. You can also build your own User Defined Function to split the contents of a field and manipulate it as you need, you can find some useful examples of SPLIT functions in this thread:
Split function equivalent in T-SQL?
Assuming I'm correct and the date part is always on the right side of the string, you can simply use RIGHT and CAST to get the date (assuming, again, that the date is represented as yyyy-mm-dd):
SELECT CAST(RIGHT(YourColumn, 10) As Date)
FROM YourTable
However, Panagiotis is correct in his comment - You shouldn't store data like that. Each column in the database should hold only a single point of data, be it string, number or date.
Update following your comment and the updated question:
SELECT LEFT(YourColumn, LEN(YourColumn) - 10) + REPLACE(RIGHT(YourColumn, 10), '-', '')
FROM YourTable
will return the desired results.

What string value will sort after text when using ORDER BY on varchar field

I'm dealing with a table that includes a list of names (type VARCHAR). Some of the names in the list will be replaced with some text due to confidentiality reasons. For one of our reports, we're running the following query (names have been replaced to protect the innocent).
SELECT * from PeopleData
WHERE [location] = 'Somewhere'
Order By [name] ASC
I'm not looking for a new select statement, but a possible value for a [name] cell that will cause the row to be sorted to the bottom of the list when this query is performed. If I use an empty string, it sorts to the top of the list. If I use a space, it sorts to the top of the list. I've even tried using a | character, since it's ASCII value is higher than any text, and it still sorts to the top of the list.
EDIT: The other criteria is that it can't be terribly obvious that the names were removed, since this list data is public. That means no 'ZZZZZZZZZZZ', and no '** CONFIDENTIAL **' values. Looking for something that's not screaming "TOP SECRET".
Any suggestions?
You could use any value you like and add to your ORDER BY statement to force the ordering:
SELECT * from PeopleData
WHERE [location] = 'Somewhere'
ORDER BY CASE WHEN [name] = 'Protected Value' THEN 1 ELSE 0 END, Name
You could change the CASE statement to use some other criteria, like a Masked flag value rather than a specific set value if you wanted to randomize the masked names.
I was facing similar problem where I can control the text, but not the SQL Statement and I wanted to force a single entry to the end of this.
A bit kludgy, but I opted to prefix every in the list with hyphen ("-") EXCEPT the one I wanted at the end.
E.g.
- Apple
- Banana
- Carrot
A thing I wanted to be last

Is there any way to put an invisible character at beginning of a string to change its sort order?

Is there any way to put a non printing or non obtrusive character at the beginning of a string of data in sqlserver. so that when an order by is performed, the string is sorted after the letter z alphabetically?
I have used a space at the beginning of the string to get the string at the top of the sorted list, but I am looking to do something similar to put a string at the end of the list.
I would rather not put another field such as "SortOrder" in the table to use to order the sort, and I would rather not have to sort the list in my code.
Added: Yes I know this is a bad idea, thanks to all for mentioning it, but still, I am curious if what I am asking can be done
Since no one is venturing to answer your question properly, here's my answer
Given: You are already adding <space> to some other data to make them appear top
Solution: Add CHAR(160) to make it appear at the bottom. This is in reality also a space, but is designed for computer systems to not treat it as a word break (hence the name).
http://en.wikipedia.org/wiki/Non-breaking_space
Your requirements:
Without adding another field such as "SortOrder" to the table
Without sorting the list in your code
I think this fits!
create table my(id int,data varchar(100))
insert my
select 1,'Banana' union all
select 2,Char(160) + 'mustappearlast' union all
select 3,' ' +N'mustappearfirst' union all
select 4,'apple' union all
select 5,'pear'
select *
from my
order by ASCII(lower(data)), data
(ok I cheated, I had to add ASCII(lower( but this is closest to your requirements than all the other answers so far)
You should use another column in the database to help specify the ordering rather than modifying the string:
SELECT *
FROM yourtable
ORDER BY sortorder, yourstring
Where you data might look like this:
yourstring sortorder
foo 0
bar 0
baz 1
qux 1
quux 2
If you can't modify the table you might be able to put the sortorder column into a different table and join to get it:
SELECT *
FROM yourtable AS T1
JOIN yourtablesorting AS T2
ON T1.id = T2.T1_id
ORDER BY T2.sortorder, T1.yourstring
Alternative solution:
If you really can't modify the database at all, not even adding a new table then you could add any character you like at the start of the string and remove it during the select:
SELECT RIGHT(yourstring, LEN(yourstring) - 1)
FROM yourtable
ORDER BY yourstring
Could you you include something like:
"<SORT1>This is my string"
"<SORT2>I'd like this to go second"
And remove them later? I think using invisible characters is fragile and hacky.
You could put a sort order in the query and use unions (no guarantees on performance).
select 1 as SortOrder, *
from table
where ... --first tier
union
select 2, *
from table
where ... --second tier
order by SortOrder
In my opinion, an invisible character for this purpose is a bad idea because it pollutes the data. I would do exactly what you would rather not do and add a new column.
To modify the idea slightly, you could implement it not as a sort order, but a grouping order, defaults to 0, where a negative integer puts the group at top of the list and a positve integer at the bottom, and then "order by sort_priority, foo"
I'm with everyone else that the ideal way to do this is by adding an additional column for sort order.
But if you don't want to add another column, and you already use a space for those items you want to appear at the top of the list, how do you feel about using a pipe (|) for items at the bottom of the list?
By default, SQL Server uses a Unicode character set for its sorting. In Unicode, the pipe and both curly brackets ({, }) come after z, so any of those three characters should work for you.

Ordering numbers that are stored as strings in the database

I have a bunch of records in several tables in a database that have a "process number" field, that's basically a number, but I have to store it as a string both because of some legacy data that has stuff like "89a" as a number and some numbering system that requires that process numbers be represented as number/year.
The problem arises when I try to order the processes by number. I get stuff like:
1
10
11
12
And the other problem is when I need to add a new process. The new process' number should be the biggest existing number incremented by one, and for that I would need a way to order the existing records by number.
Any suggestions?
Maybe this will help.
Essentially:
SELECT process_order FROM your_table ORDER BY process_order + 0 ASC
Can you store the numbers as zero padded values? That is, 01, 10, 11, 12?
I would suggest to create a new numeric field used only for ordering and update it from a trigger.
Can you split the data into two fields?
Store the 'process number' as an int and the 'process subtype' as a string.
That way:
you can easily get the MAX processNumber - and increment it when you need to generate a
new number
you can ORDER BY processNumber ASC,
processSubtype ASC - to get the
correct order, even if multiple records have the same base number with different years/letters appended
when you need the 'full' number you
can just concatenate the two fields
Would that do what you need?
Given that your process numbers don't seem to follow any fixed patterns (from your question and comments), can you construct/maintain a process number table that has two fields:
create table process_ordering ( processNumber varchar(N), processOrder int )
Then select all the process numbers from your tables and insert into the process number table. Set the ordering however you want based on the (varying) process number formats. Join on this table, order by processOrder and select all fields from the other table. Index this table on processNumber to make the join fast.
select my_processes.*
from my_processes
inner join process_ordering on my_process.processNumber = process_ordering.processNumber
order by process_ordering.processOrder
It seems to me that you have two tasks here.
• Convert the strings to numbers by legacy format/strip off the junk• Order the numbers
If you have a practical way of introducing string-parsing regular expressions into your process (and your issue has enough volume to be worth the effort), then I'd
• Create a reference table such as
CREATE TABLE tblLegacyFormatRegularExpressionMaster(
LegacyFormatId int,
LegacyFormatName varchar(50),
RegularExpression varchar(max)
)
• Then, with a way of invoking the regular expressions, such as the CLR integration in SQL Server 2005 and above (the .NET Common Language Runtime integration to allow calls to compiled .NET methods from within SQL Server as ordinary (Microsoft extended) T-SQL, then you should be able to solve your problem.
• See
http://www.codeproject.com/KB/string/SqlRegEx.aspx
I apologize if this is way too much overhead for your problem at hand.
Suggestion:
• Make your column a fixed width text (i.e. CHAR rather than VARCHAR).
• Pad the existing values with enough leading zeros to fill each column and a trailing space(s) where the values do not end in 'a' (or whatever).
• Add a CHECK constraint (or equivalent) to ensure new values conform to the pattern e.g. something like
CHECK (process_number LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][ab ]')
• In your insert/update stored procedures (or equivalent), pad any incoming values to fit the pattern.
• Remove the leading/trailing zeros/spaces as appropriate when displaying the values to humans.
Another advantage of this approach is that the incoming values '1', '01', '001', etc would all be considered to be the same value and could be covered by a simple unique constraint in the DBMS.
BTW I like the idea of splitting the trailing 'a' (or whatever) into a separate column, however I got the impression the data element in question is an identifier and therefore would not be appropriate to split it.
You need to cast your field as you're selecting. I'm basing this syntax on MySQL - but the idea's the same:
select * from table order by cast(field AS UNSIGNED);
Of course UNSIGNED could be SIGNED if required.

Resources