Adding together similar columns into one column SQL - sql-server

I'm trying to show the counts of a table of records based on a trackid.
A few of the entries in my ToTransaction column are very similar: Toshi-A, Toshi-B, Toshi-C, Tosan, Toki, Toto.
What I want to do in my query is to show all the Toshi rows combined into one row, while still giving Tosan, Toki, and Toto their own rows:
Route ToTransaction Count
F43 Toshi 100
F43 Tosan 200
F43 Toki 75
F43 Toto 125
Instead of
Route ToTransaction Count
F43 Toshi-A 35
F43 Toshi-B 25
F43 Toshi-C 22
F43 Toshi-D 18
F43 Tosan 200
F43 Toki 75
F43 Toto 125
SELECT Route, ToTransaction, count(TrackID) as 'Count' from TestDB
group by Route, ToTransaction

Try this one (note that IF() and grouping by an alias are MySQL features; see the next answer for a SQL Server version):
SELECT IF(SUBSTR(ToTransaction,1,5)='Toshi','Toshi',ToTransaction) as "Trans",
COUNT(TrackID) as "Count" from TestDB
GROUP BY Trans;

If "-" is always the separator and the number of characters before it varies, you can use substring_index.
For MySQL:
SELECT substring_index(ToTransaction,'-',1) as "Trans",
COUNT(TrackID) as "Count" from TestDB
GROUP BY 1;
For MSSQL (I couldn't test it, but it should work; note that SQL Server cannot GROUP BY a column alias, so the expression is repeated):
SELECT CASE WHEN CHARINDEX('-',ToTransaction) > 1
THEN LEFT(ToTransaction,CHARINDEX('-',ToTransaction)-1)
ELSE ToTransaction
END as "Trans",
COUNT(TrackID) as "Count" from TestDB
GROUP BY CASE WHEN CHARINDEX('-',ToTransaction) > 1
THEN LEFT(ToTransaction,CHARINDEX('-',ToTransaction)-1)
ELSE ToTransaction
END;
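To reproduce the desired output from the question, which keeps the Route column, the same prefix expression just needs to be grouped alongside Route. A minimal T-SQL sketch, assuming the TestDB table above; CROSS APPLY is used only to avoid repeating the CASE expression:
SELECT Route,
       x.Trans,
       COUNT(TrackID) as 'Count'
FROM TestDB
CROSS APPLY (SELECT CASE WHEN CHARINDEX('-', ToTransaction) > 1
                         THEN LEFT(ToTransaction, CHARINDEX('-', ToTransaction) - 1)
                         ELSE ToTransaction
                    END AS Trans) AS x  -- derived column, legal in GROUP BY
GROUP BY Route, x.Trans;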

Related

Dynamically build a SQL Insert statement based on results from a DataView

I have a legacy data-logging industrial app that I'm writing a new interface for. The program lets you select points on devices, save those to a profile, then select the devices to apply that profile to. When you apply the profile, it creates a table for each device, using the device's unique ID as the table name, and creates a column for each point of data you will be logging, named with the unique point ID. For example, I select 3 points of information to data-log and it saves those three as a Profile (in its own table), then saves the points into the Points table, tagged with that Profile:
PointID PointName ProfileID
33 Temp23 1
34 Hum14 1
35 Stat 1
I then select a couple of devices and apply the profile, which saves to the Device table:
DeviceID DeviceName ProfileID
5 NWUnit 1
6 NEUnit 1
After it saves the devices, it creates a table per device, such as:
Table Name: DEV5
Column 1: PNT1 - Float
Column 2: PNT2 - Float
Column 3: PNT3 - Bit
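So applying profile 1 to device 5 generates DDL along these lines (a sketch for illustration only; the app builds the real statement dynamically, and the SampleTime column is an assumption based on the logging code further down):
CREATE TABLE DEV5 (
    SampleTime datetime,  -- timestamp written at each logging pass
    PNT1 float,
    PNT2 float,
    PNT3 bit
);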
As you can see, the table names are directly related to the device IDs and the column names to the point IDs. I can add/remove points from the profile, and it adds/deletes columns as needed. Apply a different profile and the DEV tables get deleted and recreated. Everything works as expected, like the old program that's being replaced.
Now I need to actually do the data logging. I created a simple view:
SELECT dbo.Devices.DeviceID, dbo.Points.PointName, dbo.Points.PointID
FROM dbo.Devices
LEFT OUTER JOIN dbo.Points ON dbo.Devices.ProfileID = dbo.Points.ProfileID
Again so far so good:
DeviceID PointName PointID
5 Temp23 33
5 Hum14 34
5 Stat 35
6 Temp23 33
6 Hum14 34
6 Stat 35
I take this and throw it into a DataTable, do a Columns.Add("Value") to get a blank column, then run the data retrieval. When it's done, I have the table with the retrieved values:
DeviceID PointName PointID Value
5 Temp23 33 72.34
5 Hum14 34 43.8
5 Stat 35 1
6 Temp23 33 76.80
6 Hum14 34 54.2
6 Stat 35 0
And that's where I'm stuck. I need to take this info, use the DeviceID for the table name and the PointID for the column name, and insert the data. In other words, I need this:
Dim myParamList As New Dictionary(Of String, Object) From {
    {"@SampleTime", Date.Now},
    {"@DevTable", "Dev" & r.Item("DeviceID")},
    HOW DO I CYCLE THROUGH TO GET THE COLUMNS HERE?
}
UpdateDatabase(MySQLConnection, "INSERT INTO @DevTable (SampleTime, AND HERE?) VALUES (@SampleTime, AND HERE)", myParamList)
I cannot figure out the cycling-through part. I thought I should use a COUNT + GROUP BY to find out how many rows share the same device ID (DeviceID 5 has 3 rows, for example) and use that to cycle through that number of times, but I'm stuck trying to figure out how.
Any suggestions on the best way to do this?
So after struggling to do a GroupBy on a DataView, I decided to just run another database query with a COUNT(*) and GROUP BY DeviceID to grab my unique DeviceIDs:
DeviceID RowCount
5 3
6 3
I then used that to loop through the device IDs, using each ID to filter myView as needed. Then I dynamically created a parameterized SQL string and updated the database:
For Each r As DataRow In DevIDDataset.Tables("DeviceIDs").Rows
    ' Filter the view down to the rows for this device
    myView.RowFilter = "DeviceID=" & r.Item("DeviceID")
    Dim myParamList As New Dictionary(Of String, Object) From {
        {"@SampleTime", Date.Now}
    }
    Dim myFields As String = "SampleTime"
    Dim myValues As String = "@SampleTime"
    ' Build the column list, parameter placeholders, and parameter values
    For Each row As DataRowView In myView
        Dim myPointID As String = row.Item("PointID")
        myFields += ",obj" & myPointID
        myParamList.Add("@obj" & myPointID, row.Item("RetrievedValue"))
        myValues += ",@obj" & myPointID
    Next
    UpdateDatabase(MySQLConnection, "INSERT INTO dev" & r.Item("DeviceID") & " (" & myFields & ") VALUES (" & myValues & ")", myParamList)
Next
Not pretty but it does what it needs to do and I can't think of any other way to do it.
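For device 5 in the sample data, the loop above ends up emitting a statement shaped like this (the obj prefix and parameter names come from the code; the values travel as parameters, and only the table and column names are concatenated in):
INSERT INTO dev5 (SampleTime, obj33, obj34, obj35)
VALUES (@SampleTime, @obj33, @obj34, @obj35);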

Add an incremental number to duplicate records

I have an SSIS package that retrieves all records, including duplicates. My question is how to add an incremental value to the duplicate records (considering only ID and PropertyID).
Eg
Records from a Merge Join
ID Name PropertyID Value
1 A 1 123
1 A 1 223
2 B 2 334
3 C 1 22
3 C 1 45
Now I need to append an incremental value to the end of each record, like so:
ID Name PropertyID Value RID
1 A 1 123 1
1 A 1 223 2
2 B 2 334 1
3 C 1 22 1
3 C 1 45 2
Since IDs 1 & 3 are returned twice, the first record has RID 1 and the second RID 2.
ID and PropertyID need to be considered together to generate the repeating ID, i.e. the RID.
How can I do it in SSIS or using SQL command?
Update #1:
Please correct me if I'm wrong: since the data is not stored in any table yet, I'm unable to use a SELECT query with ROW_NUMBER(). Is there any way I can do it from the Merge Join?
You could use ROW_NUMBER:
SELECT ID,
Name,
PropertyID,
Value,
ROW_NUMBER() OVER(PARTITION BY ID, PropertyID ORDER BY Value) As RID
FROM TableName
This will do the job for you: https://paultebraak.wordpress.com/2013/02/25/rank-partitioning-in-etl-using-ssis/
You will need to write a custom script component, something like this (note that the input must arrive sorted by the grouping columns; here "subcategory" stands in for the ID/PropertyID combination):
public class ScriptMain : UserComponent
{
    string _sub_category = "";
    int _row_rank = 1;

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        if (Row.subcategory != _sub_category)
        {
            // New group: reset the rank and remember the new group key
            _row_rank = 1;
            Row.rowrank = _row_rank;
            _sub_category = Row.subcategory;
        }
        else
        {
            // Same group as the previous row: increment the rank
            _row_rank++;
            Row.rowrank = _row_rank;
        }
    }
}

How to import a column value in Cassandra when it has values like "13/01/09 23:13"?

Query:
CREATE TABLE IF NOT EXISTS "TEMP_tmp".temp (
"Date_Time" timestamp,
PRIMARY KEY ("Date_Time")
);
CSV Contains "13/01/09 23:13" values.
Error : Failed to import 1 rows: ParseError - Failed to parse 13/01/09 23:13 : invalid literal for long() with base 10: '13/01/09 23:13', given up without retries.
What Data Type should I Use ?
The default cqlsh timestamp format is: year-month-day hour:min:sec+timezone
Example :
2017-02-01 05:28:36+0000
You can either change your date format to the one above, or change the format in the cqlshrc file.
Check this answer: custom cassandra / cqlsh time_format
Cassandra will store the timestamp as 2017-02-01 08:28:21+0000. For example, if I store a timestamp in your described table "TEMP_tmp".temp (the keyspace has to stay quoted, since it was created with mixed case):
cassandra@cqlsh> INSERT INTO "TEMP_tmp".temp ("Date_Time") VALUES (toTimestamp(now()));
cassandra@cqlsh> SELECT * FROM "TEMP_tmp".temp;
Date_Time
--------------------------
2017-02-01 09:14:29+0000
If we copy all the data to CSV:
cassandra@cqlsh> COPY "TEMP_tmp".temp TO 'temp.csv';
temp.csv will contain:
2017-02-01 09:14:29+0000
If we truncate the table:
cassandra@cqlsh> TRUNCATE TABLE "TEMP_tmp".temp;
cassandra@cqlsh> SELECT * FROM "TEMP_tmp".temp;
Date_Time
--------------------------
Then if we import temp.csv:
cassandra@cqlsh> COPY "TEMP_tmp".temp FROM 'temp.csv';
Using 1 child processes
Starting copy of "TEMP_tmp".temp with columns [Date_Time].
Processed: 1 rows; Rate: 1 rows/s; Avg. rate: 1 rows/s
1 rows imported from 1 files in 0.746 seconds (0 skipped).
If you want a custom date/time format, follow Ashraful Islam's answer to your question.
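Alternatively, newer cqlsh versions let you hand the format straight to COPY via its DATETIMEFORMAT option, which takes a Python strptime pattern. A sketch, assuming the CSV dates are day/month/year:
cassandra@cqlsh> COPY "TEMP_tmp".temp FROM 'temp.csv' WITH DATETIMEFORMAT='%d/%m/%y %H:%M';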

More efficient SQL apportionment view - would a function be better?

I'm writing a view that apportions costs across different sales departments, and I'm wondering if there's a better way to do this.
Let's say I've got some code that outputs this...
Dept Type Oct Nov Dec
DeptA SalesA 10000 20000 5000
DeptA SalesB 4000 2000 8200
DeptA SalesC 6000 7000 4000
DeptB SalesA 12000 4000 6333
DeptB SalesB 8445 3880 4500
DeptB SalesC 8700 8740 6500
General Costs1 890 5874 138
General Costs2 545 547 320
General Costs3 2674 354 214
and I want to apportion 'Costs1' across the sales departments and report back on 'SalesB' from 'DeptA', like this:
Oct Nov Dec
SalesB 4000.00 2000.00 8200.00
Apportioned cost1 152.94 499.59 17.98
(where 'Apportioned cost1' = Costs1 * (DeptA.SalesB / Total sales)).
The code that I've written so far to do this looks like this...
select
Dept,
Type,
[Oct-15],
[Nov-15],
[Dec-15]
from ProfitandLoss
where Type like 'SalesB%'
and Dept like 'DeptA'
union
select
'Apportioned Costs1' as 'Dept',
'' as 'Type',
round(((round((select sum([Oct-15]) from ProfitandLoss where Type = 'Costs1'),2))* (round((select sum([Oct-15]) from ProfitandLoss where Type like 'SalesB%' and Dept like 'DeptA'),2) / round((select sum([Oct-15]) from ProfitandLoss where type like 'Sales%'),2))),2) as 'Oct-15',
round(((round((select sum([Nov-15]) from ProfitandLoss where Type = 'Costs1'),2))* (round((select sum([Nov-15]) from ProfitandLoss where Type like 'SalesB%' and Dept like 'DeptA'),2) / round((select sum([Nov-15]) from ProfitandLoss where type like 'Sales%'),2))),2) as 'Nov-15',
round(((round((select sum([Dec-15]) from ProfitandLoss where Type = 'Costs1'),2))* (round((select sum([Dec-15]) from ProfitandLoss where Type like 'SalesB%' and Dept like 'DeptA'),2) / round((select sum([Dec-15]) from ProfitandLoss where type like 'Sales%'),2))),2) as 'Dec-15'
from ProfitandLoss
Due to the number of SELECT statements within the query, it currently takes 43 seconds to run, and at the moment that's just for one department over three months. I need to run this for 34 departments over 12 months! Is there a more efficient way to do this? Would a function that stores each department's percentage of total sales, which the query could then refer to rather than sub-querying multiple times for individual numbers, be any quicker?
Also assume that the costs are not apportioned evenly; for example, 'Costs2' is only split between DeptA.SalesA and DeptB.SalesA, so I need to make specific apportionments rather than dividing everything by the total.
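One way to avoid the repeated scans is to compute every total once, in a single pass, using conditional aggregation, and then combine them. A minimal T-SQL sketch for the Costs1/DeptA.SalesB case, assuming the ProfitandLoss layout above; extending it to more months, departments, and specific apportionment rules just means adding more CASE filters:
WITH Totals AS (
    SELECT
        -- one scan of the table replaces one sub-select per number
        SUM(CASE WHEN Type = 'Costs1' THEN [Oct-15] ELSE 0 END) AS Costs1Oct,
        SUM(CASE WHEN Type LIKE 'Sales%' THEN [Oct-15] ELSE 0 END) AS TotalSalesOct,
        SUM(CASE WHEN Dept = 'DeptA' AND Type LIKE 'SalesB%' THEN [Oct-15] ELSE 0 END) AS DeptASalesBOct
    FROM ProfitandLoss
)
SELECT 'Apportioned Costs1' AS Dept,
       -- 1.0 forces decimal arithmetic in case the columns are integers
       ROUND(Costs1Oct * (DeptASalesBOct * 1.0 / TotalSalesOct), 2) AS [Oct-15]
FROM Totals;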

Querying table for multiple combinations and default values

I have a table like
City POSTAGE PRICE
HOUSTON DEFAULT 20
DEFAULT AIR 14
DEFAULT GROUND 30
DEFAULT DEFAULT 40
Now I want to query this table for a price with a combination like 'CHICAGO,GROUND',
which should check whether the exact combination exists; if not, DEFAULT should be substituted and the value looked up again.
For example:
HOUSTON,AIR should return 14
HOUSTON,GROUND should return 20
HOUSTON,FEDEX should return 20
CHICAGO,FEDEX should return 40
Is there a way to achieve this without writing multiple queries?
Thank you!
This uses the SQL*Plus syntax for passing parameters; you may need to change that to suit:
select price
from your_table
where ( city = '&p_city' or city = 'DEFAULT')
and ( postage = '&p_postage' or postage = 'DEFAULT')
order by case when city = '&p_city' then 1 else 2 end
, case when postage = '&p_postage' then 1 else 2 end
This will return multiple rows, but presumably you want only the one PRICE. The ORDER BY clause prioritises matches on CITY over matches on POSTAGE. You can then select the first row.
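To actually pick just that first row, the ordered query can be wrapped and limited. A sketch assuming Oracle, which the SQL*Plus parameter syntax implies (on 12c or later, FETCH FIRST 1 ROW ONLY would do the same job):
select price
from ( select price
       from your_table
       where ( city = '&p_city' or city = 'DEFAULT')
       and ( postage = '&p_postage' or postage = 'DEFAULT')
       order by case when city = '&p_city' then 1 else 2 end
              , case when postage = '&p_postage' then 1 else 2 end
     )
where rownum = 1;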
