Formatting an sql column to contain only time in minutes - sql-server

I have an SQL table that has this data
I need the column formatted so that, instead of this string of numbers and characters, it shows only the time in minutes. For example (in minutes):
88
85
85
67
63
76
71
75
75
42

I agree with Larnu's comment.
You can try something like the following.
declare @string varchar(20) = '1 hour 28 mins'
Select @string,case when CHARINDEX('hour',@string)>1 then
SUBSTRING(@string,1,CHARINDEX('hour',@string)-1) * 60 else 0 end
+
case when CHARINDEX('mins',@string)>1 then
SUBSTRING(@string,CHARINDEX('mins',@string)-3,2) else 0 end
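The same parsing logic can be sketched in Python, which makes it easy to check against a few sample values. This is only an illustration: the function name to_minutes is hypothetical, and the input format ('1 hour 28 mins') is assumed from the example string above, since the original table data is not shown.

```python
import re

def to_minutes(text: str) -> int:
    """Convert strings such as '1 hour 28 mins' to a total number of minutes."""
    hours = re.search(r"(\d+)\s*hour", text)   # optional hours part
    mins = re.search(r"(\d+)\s*min", text)     # optional minutes part
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(mins.group(1)) if mins else 0
    return total

for sample in ["1 hour 28 mins", "1 hour 3 mins", "42 mins", "2 hours"]:
    print(sample, "->", to_minutes(sample))
```

Unlike the fixed-width SUBSTRING approach, the regular expression does not depend on the minutes value being one or two digits wide.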


capture reoccurring seventh day in new column

I have the below table...
run_dt check_type curr_cnt
6/1/21 ALL 50
5/31/21 ALL 25
5/26/21 ALL 43
5/25/21 ALL 70
6/1/21 SUB 23
5/25/21 SUB 49
I would like to capture the curr_cnt that the same check_type had seven days before the run_dt, i.e. the value from the same weekday of the previous week.
Something like...
run_dt check_type curr_cnt prev_nt
6/1/21 ALL 50 70
5/31/21 ALL 25
5/26/21 ALL 43
5/25/21 ALL 70
6/1/21 SUB 23 49
5/25/21 SUB 49
Can I use lead/lag or CTE?
What's the best option here, appreciate the feedback.
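You could join the table to itself, matching on both the date seven days earlier and the check_type:
SELECT
a.run_dt,
a.check_type,
a.curr_cnt,
b.curr_cnt as prev_nt
from table a
left join table b on b.run_dt = dateadd(d,-7,a.run_dt)
and b.check_type = a.check_type -- without this, ALL and SUB rows from the same date would both match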
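To see what the self-join is expected to return, here is a small Python sketch of the same lookup using the sample rows from the question. The names rows, by_key and with_prev are hypothetical. Note that the lookup key includes check_type as well as the date; with the date alone, each 6/1/21 row would match both the ALL and the SUB row from 5/25/21.

```python
from datetime import date, timedelta

# Sample rows from the question: (run_dt, check_type, curr_cnt)
rows = [
    (date(2021, 6, 1), "ALL", 50),
    (date(2021, 5, 31), "ALL", 25),
    (date(2021, 5, 26), "ALL", 43),
    (date(2021, 5, 25), "ALL", 70),
    (date(2021, 6, 1), "SUB", 23),
    (date(2021, 5, 25), "SUB", 49),
]

# Index counts by (check_type, run_dt); this plays the role of the joined copy "b".
by_key = {(check_type, run_dt): cnt for run_dt, check_type, cnt in rows}

def with_prev(rows):
    """Attach the count from seven days earlier for the same check_type (None if absent)."""
    result = []
    for run_dt, check_type, cnt in rows:
        prev = by_key.get((check_type, run_dt - timedelta(days=7)))
        result.append((run_dt, check_type, cnt, prev))
    return result

for row in with_prev(rows):
    print(row)
```

Rows with no match seven days earlier get None, which corresponds to the NULL produced by the left join.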

Possible (invisible) characters in text pasted from Excel or XML to a TSQL query

My ASP.Net application receives an uploaded Excel file, then converts to XML to pass as a stored procedure parameter for processing. This file contains lots of entries that are text-matched to find the corresponding value.
Today (after working perfectly for years) three seemingly identical values caused mixed results on the text match. I can only illustrate with an example:
The first value is the original value within Excel (copied from the
spreadsheet).
The second value is that which the web application
saves to XML, and is the raw value taken from the profiler showing
the parameter values
The third value is that which currently exists
in the table
The code is pasted from SSMS:
SELECT SkillGroupTitle FROM tbl_SkillGroups WHERE SkillGroupTitle = N'Quality Assurance Instruction (QAI)'; -- pasted from Excel
SELECT SkillGroupTitle FROM tbl_SkillGroups WHERE SkillGroupTitle = N'Quality Assurance Instruction (QAI)'; -- pasted from xml
SELECT SkillGroupTitle FROM tbl_SkillGroups WHERE SkillGroupTitle = N'Quality Assurance Instruction (QAI)'; -- pasted from SQL table (existing value)
Can somebody please advise what is happening here? All values appear identical visually, but there's clearly something different within the inbound values from Excel.
Update
Pasting the two values into a hex converter, there are indeed three differences.
Excel data:
51 75 61 6c 69 74 79 a0 41 73 73 75 72 61 6e 63 65 a0 49 6e 73 74 72 75 63 74 69 6f 6e 20 28 51 41 49 29 0a
-- -- --
SQL data:
51 75 61 6c 69 74 79 20 41 73 73 75 72 61 6e 63 65 20 49 6e 73 74 72 75 63 74 69 6f 6e 20 28 51 41 49 29
Can anyone shed any light here please?
Updated
First, to identify hidden characters causing strings not to be equal you can leverage ngrams8k like this:
WITH tbl_SkillGroups(dSource, SkillGroupTitle) AS
(
SELECT 'Excel', N'Quality Assurance Instruction (QAI)' -- pasted from Excel
UNION ALL
SELECT 'XML', N'Quality Assurance Instruction (QAI)'+CHAR(10) -- pasted from xml
UNION ALL
SELECT 'SQL', N'Quality Assurance Instruction (QAI)'+CHAR(13) -- pasted from SQL table (existing value)
)
SELECT
[Source] = ts.dSource,
Position = ng.Position,
Token = ng.Token,
asciiValue = ASCII(ng.Token)
FROM tbl_SkillGroups AS ts
CROSS APPLY samd.ngrams8k(ts.SkillGroupTitle,1) AS ng
--WHERE ng.Position > 32 -- Zoom into the last few characters
Returns:
Source Position Token asciiValue
------ --------- ------- -----------
Excel 33 A 65
Excel 34 I 73
Excel 35 ) 41 -- Only 35 characters
XML 33 A 65
XML 34 I 73
XML 35 ) 41
XML 36 10 -- 36th character is a CHAR(10) (Looks like a space)
SQL 33 A 65
SQL 34 I 73
SQL 35 ) 41
SQL 36 13 -- 36th character is a CHAR(13) (Also looks like a space)
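The same kind of character-by-character inspection can be sketched outside the database. The Python snippet below is an illustration only: the function name suspicious_chars is hypothetical, and the sample string is rebuilt from the hex dump in the question (0xA0 in place of two spaces, plus a trailing 0x0A).

```python
def suspicious_chars(text: str):
    """Return (1-based position, code point) for each character outside printable ASCII."""
    return [
        (i + 1, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if not (0x20 <= ord(ch) <= 0x7E)
    ]

# Value reconstructed from the "Excel data" hex dump in the question.
excel_value = "Quality\u00a0Assurance\u00a0Instruction (QAI)\n"
sql_value = "Quality Assurance Instruction (QAI)"

print(suspicious_chars(excel_value))  # [(8, 'U+00A0'), (18, 'U+00A0'), (36, 'U+000A')]
print(suspicious_chars(sql_value))    # []
```

U+00A0 is a non-breaking space and U+000A is a line feed, which accounts for the three differing bytes in the hex dump.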
Next, to clean hidden characters from your inputs you can use PatReplace8k.
WITH tbl_SkillGroups(dSource, SkillGroupTitle) AS
(
SELECT 'Excel', N'Quality Assurance Instruction (QAI)' -- pasted from Excel
UNION ALL
SELECT 'XML', N'Quality Assurance Instruction (QAI)'+CHAR(10) -- pasted from xml
UNION ALL
SELECT 'SQL', N'Quality Assurance Instruction (QAI)'+CHAR(13) -- pasted from SQL table (existing value)
)
SELECT SkillGroupTitle = pr.NewString -- return the cleaned value
FROM tbl_SkillGroups as ts
CROSS APPLY PatReplace8K(ts.SkillGroupTitle,'[^a-zA-Z ()]','') as pr
Returns:
SkillGroupTitle
------------------------------------
Quality Assurance Instruction (QAI)
Quality Assurance Instruction (QAI)
Quality Assurance Instruction (QAI)
Here, PatReplace8k would remove any characters that didn't match the pattern (letters, spaces and parentheses) thus making these three values equal.
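If the cleanup is done before the value reaches SQL Server, a comparable normalization could look like the Python sketch below. It is an assumption-laden illustration, not the method above: the function name normalize is hypothetical, and it maps the non-breaking space to a regular space instead of deleting it, because deleting it would join the neighbouring words together.

```python
import re

def normalize(text: str) -> str:
    """Replace non-breaking spaces and strip control characters and outer whitespace."""
    text = text.replace("\u00a0", " ")            # non-breaking space -> regular space
    text = re.sub(r"[\x00-\x1f\x7f]", "", text)   # drop control characters such as LF and CR
    return text.strip()

excel_value = "Quality\u00a0Assurance\u00a0Instruction (QAI)\n"
sql_value = "Quality Assurance Instruction (QAI)"

print(normalize(excel_value) == sql_value)  # True
```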

Array manipulation in Perl

The Scenario is as follows:
I have a dynamically changing text file which I'm passing to a variable to capture a pattern that occurs throughout the file. It looks something like this:
my @array1;
my $file = `cat <file_name>.txt`;
if (@array1 = ( $file =~ m/<pattern_match>/g) ) {
print "@array1\n";
}
The array looks something like this:
10:38:49 788 56 51 56 61 56 59 56 51 56 80 56 83 56 50 45 42 45 50 45 50 45 43 45 54 10:38:51 788 56 51 56 61 56 59 56 51 56 80 56 83 56 50 45 42 45 50 45 50 45 43 45 54
From the above array1 output, the pattern of the array is something like this:
T1 P1 t1(1) t1(2)...t1(25) T2 P2 t2(1) t2(2)...t2(25) so on and so forth
Currently, /g in the regex returns a set of values that occur only twice (only because the txt file contains this pattern that number of times). This particular pattern occurrence will change depending on the file name that I plan to pass dynamically.
What I intend to achieve:
The final result should be a csv file that contains these values in the following format:
T1,P1,t1(1),t1(2),...,t1(25)
T2,P2,t2(1),t2(2),...,t2(25)
so on and so forth
For instance: My final CSV file should look like this:
10:38:49,788,56,51,56,61,56,59,56,51,56,80,56,83,56,50,45,42,45,50,45,50,45,43,45,54
10:38:51,788,56,51,56,61,56,59,56,51,56,80,56,83,56,50,45,42,45,50,45,50,45,43,45,54
The delimiter for this pattern is T1 which is time in the format \d\d:\d\d:\d\d
Example: 10:38:49, 10:38:51 etc
What I have tried so far:
use Data::Dumper;
use List::MoreUtils qw(part);
my $partitions = 2;
my $i = 0;
print Dumper part {$partitions * $i++ / @array1} @array1;
In this particular case, my $partitions = 2; holds good since the pattern occurrence in the txt file is only twice, and hence, I'm splitting the array into two. However, as mentioned earlier, the pattern occurrence number keeps changing according to the txt file I use.
The Question:
How can I make this code more generic to achieve my final goal of splitting the array into multiple equal sized arrays without losing the contents of the original array, and then converting these mini-arrays into one single CSV file?
If there is any other workaround for this other than array manipulation, please do let me know.
Thanks in advance.
PS: I considered Hash of Hashes and Array of Hashes, but that kind of a data structure did not seem to be healthy solution for the problem I'm facing right now.
As far as I can tell, all you need is splice, which will work fine as long as you know the record size and it's constant
The data you showed has 52 fields, but the description of it requires 27 fields per record. It looks like each line has T, P, and t1 .. t24, rather than ending at t25
Here's how it looks if I split the data into 26-element chunks
use strict;
use warnings 'all';
my @data = qw/
10:38:49 788 56 51 56 61 56 59 56 51 56 80 56 83 56 50 45 42 45 50 45 50 45 43 45 54 10:38:51 788 56 51 56 61 56 59 56 51 56 80 56 83 56 50 45 42 45 50 45 50 45 43 45 54
/;
while ( @data ) {
my @set = splice @data, 0, 26;
print join(',', @set), "\n";
}
output
10:38:49,788,56,51,56,61,56,59,56,51,56,80,56,83,56,50,45,42,45,50,45,50,45,43,45,54
10:38:51,788,56,51,56,61,56,59,56,51,56,80,56,83,56,50,45,42,45,50,45,50,45,43,45,54
If you wanted to use List::MoreUtils instead of splice, the natatime function returns an iterator that does the same thing as the splice above.
Like this
use List::MoreUtils qw/ natatime /;
my $iter = natatime 26, @data;
while ( my @set = $iter->() ) {
print join(',', @set), "\n";
}
The output is identical to that of the program above
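Both versions above rely on a known, constant record size. Since the question says each record starts with a time in the form \d\d:\d\d:\d\d, an alternative is to start a new record whenever such a token appears. The sketch below shows that idea in Python rather than Perl, purely as an illustration; the names TIME and split_records are hypothetical.

```python
import re

TIME = re.compile(r"\d\d:\d\d:\d\d")

def split_records(tokens):
    """Start a new record at every timestamp token; other tokens join the current record."""
    records = []
    for tok in tokens:
        if TIME.fullmatch(tok):
            records.append([tok])
        elif records:               # ignore anything that appears before the first timestamp
            records[-1].append(tok)
    return records

values = "788 56 51 56 61 56 59 56 51 56 80 56 83 56 50 45 42 45 50 45 50 45 43 45 54"
tokens = f"10:38:49 {values} 10:38:51 {values}".split()

for record in split_records(tokens):
    print(",".join(record))
```

This keeps working if the number of values per record changes between files, at the cost of assuming that no data value ever looks like a timestamp.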
Note
It is very wrong to start a new shell process just to use cat to read a file. The standard method is to undefine the input record separator $/ like this
my $file = do {
open my $fh, '<', '<file_name>.txt' or die "Unable to open file for input: $!";
local $/;
<$fh>;
};
Or if you prefer you could use File::Slurper like this
use File::Slurper qw/ read_binary /;
my $file = read_binary '<file_name>.txt';
although you will probably have to install it as it is not a core module

comparing multiple column files using python3

input_file1:
a 1 33
a 34 67
a 68 78
b 1 99
b 100 140
c 1 70
c 71 100
c 101 190
input file2:
a 5 23
a 30 72
a 76 78
b 5 30
c 23 88
c 92 98
I want to compare these two files such that for every value of 'a' in file2 the two integers (boundary) fall in the range (boundaries) of 'a' in file1 or between two ranges.
Instead of storing values as 'a 1 33', you could write each record with an explicit separator (for example 'a:1:33'), which makes the data easier to read back.
You can then read each line, split it on the ':' separator, and compare the resulting fields with those from the other file.
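A minimal sketch of the comparison itself, assuming the whitespace-separated format shown in the question and one possible reading of the requirement: an interval from file2 is labelled "within" if both of its boundaries fall inside the same file1 range, "spans" if they fall inside two different file1 ranges, and "outside" otherwise. The function names and labels are hypothetical.

```python
from collections import defaultdict

def parse(text):
    """Group (start, end) pairs by their leading key, e.g. 'a 1 33' -> {'a': [(1, 33)]}."""
    ranges = defaultdict(list)
    for line in text.strip().splitlines():
        key, start, end = line.split()
        ranges[key].append((int(start), int(end)))
    return ranges

def find_range(ranges, value):
    """Return the index of the range containing value, or None if no range contains it."""
    for i, (lo, hi) in enumerate(ranges):
        if lo <= value <= hi:
            return i
    return None

def classify(file1, file2):
    results = []
    for key, intervals in file2.items():
        candidates = file1.get(key, [])
        for start, end in intervals:
            a, b = find_range(candidates, start), find_range(candidates, end)
            if a is None or b is None:
                label = "outside"
            elif a == b:
                label = "within"
            else:
                label = "spans"
            results.append((key, start, end, label))
    return results

file1 = parse("a 1 33\na 34 67\na 68 78\nb 1 99\nb 100 140\nc 1 70\nc 71 100\nc 101 190")
file2 = parse("a 5 23\na 30 72\na 76 78\nb 5 30\nc 23 88\nc 92 98")

for row in classify(file1, file2):
    print(*row)
```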

NUMERIC and VARCHAR

I am using SQL Server 2008 R2 to run queries and I have come across a database where it stores numeric values as varchar(4). For example:
SELECT [num]
FROM [TABLE1]
WHERE num > '95'
I get the below results
96
97
98
99
999
However when I run the same query without the '' i.e.
SELECT [num]
FROM [TABLE1]
WHERE num > 95
then I get
100
101
102
103
104
105
106
107
108
109
110
111
112
113
116
117
120
7001
7002
7003
7004
7005
7006
7007
96
97
98
99
999
In any case, I am not getting numbers in order i.e. 95, 96, 97, 98, 99. I understand this is because they are stored as varchar(4) i.e. of a string format. Please can someone explain what happens in both situations and how does a string compare in both the above cases?
Also if someone can help me write the code to change these varchar(4) into numeric on the fly so I can arrange them properly?
Much appreciated.
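When you use > '95', both sides are strings, so the values are compared character by character in string order: '999' is greater than '95', but '100' is not, because '1' sorts before '9'. When you use > 95, the column is implicitly converted to a number, so the comparison is numeric and every value above 95 is returned.
To be sure what actually happens, you should do the casting yourself. And of course you should not store numbers as varchars. Neither query has an ORDER BY, so the order of the rows is not guaranteed in either case.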
The correct ordering would be with
order by convert(int, num)
but it will fail if the column contains any non-numeric values.
On a varchar column, > does a lexicographical (string) comparison, not a numeric one, so the rows that qualify are the ones that sort after '95' as strings.
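The difference between the two comparisons can be reproduced in a few lines of Python, used here only as an illustration. String comparison in Python goes by code point rather than by a SQL Server collation, but for digit-only strings the behaviour is the same. The variable names are hypothetical, and the values are a subset of those in the question.

```python
values = ["96", "97", "98", "99", "999", "100", "101", "7001"]

# String comparison: compared character by character, so '100' < '95' because '1' < '9'.
as_text = [v for v in values if v > "95"]
print(as_text)      # ['96', '97', '98', '99', '999']

# Numeric comparison: convert first, then compare.
as_number = [v for v in values if int(v) > 95]
print(as_number)    # all eight values

# Numeric ordering, the equivalent of ORDER BY CONVERT(int, num).
print(sorted(values, key=int))  # ['96', '97', '98', '99', '100', '101', '999', '7001']
```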
