Pretty new to BI and SQL in general, but a few months ago I didn't even know what a model is and now here I am...trying to build a package that runs daily.
Currently I run this in Excel via Power Query, but because there is so much data, I have to manually change the query every month. I decided to move it into SSIS.
Required outcome: Pull the last date in my Database and use it as a variable in the model (as I have millions of rows, I only want to load lines with dates greater than what I have in my table already).
Here is my Execute SQL Task:
I set up a variable for the SQL query
and trying to use it in my OLE DB query like this
Execute SQL Task: the results are fine; it returns the date as "dd/mm/yyyy hh24:mi:ss"
SELECT MAX (CONVACCT_CREATE_DATE) AS Expr1 FROM GOMSDailySales
Variable for OLE DB SQL Query:
"SELECT fin_booking_code, FIN_DEPT_CODE, FIN_ACCT_NO, FIN_PROD_CODE, FIN_PROG_CODE, FIN_OPEN_CODE, DEBIT_AMT, CREDIT_AMT, CURRENCY_CODE, PART_NO, FIN_DOC_NO, CREATE_DATE
FROM cuown.converted_accounts
WHERE (CREATE_DATE > TO_DATE(@[User::GetMaxDate],'yyyy/mm/dd hh24:mi:ss'))
AND (FIN_ACCT_NO LIKE '1%')"
Currently I get a "missing expression" error; if I wrap @[User::GetMaxDate] in single quotes, I get a "year must be between 0 and xxxx" error.
What am I doing wrong / is there a cleaner way to get this done?
In the OLE DB source, change the data access mode to SQL command and use the following command:
SELECT fin_booking_code, FIN_DEPT_CODE, FIN_ACCT_NO, FIN_PROD_CODE, FIN_PROG_CODE, FIN_OPEN_CODE, DEBIT_AMT, CREDIT_AMT, CURRENCY_CODE, PART_NO, FIN_DOC_NO, CREATE_DATE
FROM cuown.converted_accounts
WHERE (CREATE_DATE > TO_DATE(?,'yyyy/mm/dd hh24:mi:ss'))
AND (FIN_ACCT_NO LIKE '1%')
Then click the Parameters button and map @[User::GetMaxDate] to the first parameter.
For more information, check the following answer: Parameterized OLEDB source query
Alternative method
If parameters are not supported in the OLE DB provider you are using, create a variable of type string and evaluate this variable as the following expression:
"SELECT fin_booking_code, FIN_DEPT_CODE, FIN_ACCT_NO, FIN_PROD_CODE, FIN_PROG_CODE, FIN_OPEN_CODE, DEBIT_AMT, CREDIT_AMT, CURRENCY_CODE, PART_NO, FIN_DOC_NO, CREATE_DATE
FROM cuown.converted_accounts
WHERE CREATE_DATE > TO_DATE('" + (DT_WSTR, 50)@[User::GetMaxDate] +
"' ,'yyyy/mm/dd hh24:mi:ss') AND FIN_ACCT_NO LIKE '1%'"
Then, in the OLE DB source, change the data access mode to SQL command from variable and select the string variable you created.
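One thing worth checking with the string approach: the question reports the date coming back as dd/mm/yyyy hh24:mi:ss, while the TO_DATE mask is yyyy/mm/dd hh24:mi:ss. If the text and the mask disagree, the parse fails. The sketch below shows the same idea in Python as an analogy only: it uses Python's strptime rather than Oracle's TO_DATE, and the sample value and the parse helper are made up for illustration.

```python
from datetime import datetime

# Hypothetical sample value in the dd/mm/yyyy layout the question reports.
raw = "14/01/2016 10:30:00"


def parse(value, mask):
    """Return the parsed datetime, or None when the mask does not fit the text."""
    try:
        return datetime.strptime(value, mask)
    except ValueError:
        return None


# Mask in yyyy/mm/dd order, like the one in the query: it does not fit the text.
mismatched = parse(raw, "%Y/%m/%d %H:%M:%S")

# Mask in dd/mm/yyyy order: it fits, and the value can then be rewritten in
# the yyyy/mm/dd layout that the mask in the query expects.
matched = parse(raw, "%d/%m/%Y %H:%M:%S")
normalized = matched.strftime("%Y/%m/%d %H:%M:%S")
```

If the mask turns out to be the problem, either change the mask to match the text or produce the string in the layout the mask expects.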
You're trying to use the SSIS variable as if it were a variable inside the query. When constructing a SQL query in a string variable, you simply need to concatenate the strings together. The expression for your query string variable should look like this:
"SELECT fin_booking_code, FIN_DEPT_CODE, FIN_ACCT_NO, FIN_PROD_CODE, FIN_PROG_CODE, FIN_OPEN_CODE, DEBIT_AMT, CREDIT_AMT, CURRENCY_CODE, PART_NO, FIN_DOC_NO, CREATE_DATE
FROM cuown.converted_accounts
WHERE CREATE_DATE > " + @[User::GetMaxDate] +
" AND (FIN_ACCT_NO LIKE '1%')"
Related
I'm attempting to edit an ETL package (SSIS) that queries a SQL table and outputs CSV files for every StationID, and I'm having trouble understanding how the question mark is used in the query definition below. I understand that ? is used as a parameter, but I don't understand how it's used in the date function below:
SELECT TimeSeriesIdentifier, StationID, ParameterID FROM dbo.EtlView WHERE
LastModified > DATEADD(hour, ?*-1, GETDATE())
AND StationID LIKE
CASE WHEN ? = 0 THEN
StationID
ELSE
?
END
The parameterization available in SSIS is dependent upon the connection manager used.
OLE DB and ODBC based connection managers use ? as the variable placeholder, whereas ADO.NET uses a named parameter, @myVariable.
OLE DB begins counting at 0, whereas ODBC uses a 1-based counting system. Both are ordinal-based systems, however, so in your CASE expression the two ? are for the same variable. You'll have to list that SSIS variable twice in the parameter mapping dialog because it's ordinal based, i.e. (param, name) => @HoursBack, 0; @MyVar, 1; and @MyVar, 2;
A "dumb trick" I would employ if I had to deal with repeated ordinal based parameters or if I was troubleshooting packages is to make the supplied query use local variables in the query itself.
DECLARE
@HoursBack int = ?
, @MyVariable int = ?;
SELECT
TimeSeriesIdentifier
, StationID
, ParameterID
FROM
dbo.EtlView
WHERE
LastModified > DATEADD(HOUR, @HoursBack * -1, GETDATE())
AND StationID LIKE
CASE
WHEN @MyVariable = 0 THEN StationID
ELSE @MyVariable
END;
Now I only have to map the SSIS variable @MyVar once into my script, as the "normal" T-SQL parameterization takes over. The other benefit is that I can copy and paste that into a query tool and sub in actual values for the ?s to inspect the results directly from the source. This can be helpful if you're running into situations where the strong typing in SSIS prevents you from getting the results into a data viewer.
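To see the difference between the two styles outside SSIS, here is a rough sketch in Python with the standard sqlite3 module, which also binds ? placeholders by position. It is an analogy, not SSIS itself: the table contents are invented, the date filter is left out, and a one-row CTE (params, a name made up here) stands in for the DECLARE block because SQLite has no local variables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EtlView (TimeSeriesIdentifier TEXT, StationID TEXT)")
conn.executemany(
    "INSERT INTO EtlView VALUES (?, ?)",
    [("ts1", "A100"), ("ts2", "B200"), ("ts3", "A300")],
)

station = "A%"  # invented filter value; '0' would mean "all stations"

# Ordinal style: the value appears twice in the query, so it is bound twice.
ordinal_sql = """
    SELECT TimeSeriesIdentifier FROM EtlView
    WHERE StationID LIKE CASE WHEN ? = '0' THEN StationID ELSE ? END
    ORDER BY TimeSeriesIdentifier
"""
ordinal_rows = conn.execute(ordinal_sql, (station, station)).fetchall()

# Bind-once style: the value is bound a single time and then reused by name.
bind_once_sql = """
    WITH params(station) AS (SELECT ?)
    SELECT TimeSeriesIdentifier FROM EtlView, params
    WHERE StationID LIKE
        CASE WHEN params.station = '0' THEN StationID ELSE params.station END
    ORDER BY TimeSeriesIdentifier
"""
bind_once_rows = conn.execute(bind_once_sql, (station,)).fetchall()
```

Both queries return the same rows; the second only needs one entry in the parameter list, which is the point of the DECLARE trick.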
SSIS is building a parameterized query.
You can get more information about this here (MySQL-specific):
What is the question mark's significance in MySQL at "WHERE column = ?"?
Or you can get a more generally-applicable response here: What does a question mark represent in SQL queries?
At a very "nuts and bolts" level, those are parameters being passed into the SQL statement by the package. With the Execute SQL task open, click on the tab that says Parameter Mapping. There will be a list of variables that are being sent into the query, and they are consumed in the order that they're listed.
Here's a logger for an archiving package I'm working on:
The query on the General tab just writes those five values to a table:
INSERT INTO dbo.ArchiveRowCounts (
TableName,
ServerName,
ReportYear,
BaseTblCnt,
ArchiveTblCnt)
VALUES (?,?,?,?,?);
I edited the question based on the solution that Hadi gave.
I am using SSIS in VS 2013.
I have two user variables, MyVariableList and Query.
This is the expression in the user variable Query: "SELECT cola, colB FROM myTable WHERE myID IN (" + @[User::MyVariableList] + ")"
I have a Script Task that sets the value of @[User::MyVariableList].
Dts.Variables["User::MyVariableList"].Value = sList;
After that, I have a Data Flow Task with an OLE DB Source (from one database) feeding an OLE DB Destination (another database on another server).
In the OLE DB Source Editor, I set
Data access mode: SQL Command from variable
Variable name: User::Query
In the OLE DB Source connection, I have set DelayValidation to True
Before I can even run the package, I get this error
How can I fix this issue? Thank you
First of all, you are working with a project parameter, not a project variable.
You cannot achieve this using a parameterized SQL command; you have to build the whole query inside a variable and use it as the source (SQL command from variable).
Parameters are used to specify a single value, not to concatenate pieces of the query string.
Create an SSIS variable (User::Query), set the variable's EvaluateAsExpression property to True, and write the expression in the variable's Expression property, like the following:
"SELECT cola, colB FROM myTable WHERE myID IN (" + @[$Project::MyProjectVariable] + ")"
Note: to use a project parameter inside an expression, use the following syntax: @[$Project::MyProjectVariable]
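If the list comes from a Script Task, it may be worth validating it before it is concatenated into the query text, since anything in the string becomes part of the SQL. A minimal sketch of that idea in Python (the actual Script Task would be C# or VB; build_in_query is a hypothetical helper, and it assumes myID is an integer column):

```python
def build_in_query(id_list: str) -> str:
    """Build the query text from a comma-separated list such as "1, 2, 3"."""
    # int() rejects anything that is not a whole number, so arbitrary text
    # cannot end up inside the SQL string.
    ids = [str(int(part)) for part in id_list.split(",") if part.strip()]
    if not ids:
        raise ValueError("the ID list is empty")
    return "SELECT cola, colB FROM myTable WHERE myID IN (" + ", ".join(ids) + ")"


query = build_in_query("3, 7,11")
```

An empty list is rejected as well, because IN () is not valid SQL.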
I'm trying to load incremental data from an ODBC server into SQL Server using a common table expression.
When I run the query in DBeaver, it executes correctly:
with test as
(
SELECT userid,sum(goldbalance)
FROM Server.events_live
where eventTimestamp>=DATE '2016-01-01' + INTERVAL '-100 day'
group by userid
order by sum(goldbalance) desc)
)
select * from test
When I run it from a SQL command expression on the ODBC source, it fails with a syntax error. It looks as follows:
"with test as
(
SELECT userid,sum(goldbalance)
FROM deltadna.events_live
where eventTimestamp>=DATE '"+@[User::datestring]+"' + INTERVAL '-100 day'
group by userid
order by sum(goldbalance) desc)
)
select * from test"
The datestring variable gets the server date and converts it to a string in the format yyyy-mm-dd. I usually use this method to pull data from ADO.NET and it works properly.
Is there any other way to pull incremental data from an ODBC server using SSIS variables?
With OLE DB
Try this code; it works for me with my own tables on SQL Server:
SELECT userid,sum(goldbalance) AS SUMGOLD
FROM deltadna.events_live
WHERE eventTimestamp >= DATEADD(DAY, -100,CONVERT(DATE,?))
GROUP BY userid
ORDER BY SUMGOLD desc
You have to click on Parameters in the OLE DB Source Editor to configure what you need. Use '?' to represent a variable in your query.
If your query is too complicated, store it in a stored procedure and call it like this:
EXEC schema.storedProcedureName ?
And map the '?' to your variable @[User::datestring]
With ODBC
The expressions are outside the data flow in Data Flow Properties.
Select the expression property and add your dynamic query.
And your expression will be
"SELECT userid,sum(goldbalance) AS SumGold
FROM deltadna.events_live
where eventTimestamp>=DATE '"+@[User::datestring]+"' +INTERVAL '-100 day'
group by userid
order by SumGold desc"
It seemed to be an easy task, but I have failed to find a solution for my problem: I have a local table in Access 2010 with a date/time column, and I want to update a column of data type date in a SQL Server table.
Sending the date/time values directly to the SQL Server table fails, as does converting the date/time column with this VBA function:
Function DateForSQL(dteDate) As String
DateForSQL = "'" & Format(CDate(dteDate), "yyyy-mm-dd") & "'"
End Function
which gives
DateForSQL(Date()) = '2016-01-14'
and should work, I assumed.
The update command is this:
UPDATE SQL_table
INNER JOIN local_table ON SQL_table.ID = local_table.ID
SET SQL_table.DateField = DateForSQL(local_table.DateField)
But it fails again in Access with a type conversion error.
Even when changing the SQL Server table column to datetime I get the same error.
Same with sending to SQL a string like '14/01/2016' or '01/14/2016'.
The only thing I could do, eventually, is change the datetime to text in Access and try again, but this cannot be the only solution.
Any help?
Thanks
Michael
First of all, I recommend using the ISO 8601 format for your date, which is YYYYMMDD (no dashes, nothing); this works for all regional and language settings in SQL Server.
Next, I'm not sure about Access' SQL syntax, but in SQL Server, your UPDATE statement would be something like this:
UPDATE sql
SET sql.DateField = DateForSQL(local.DateField)
FROM local_table local
INNER JOIN SQL_table sql ON local.ID = sql.ID
First UPDATE, then SET, then FROM and INNER JOIN ...
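As a small illustration of the YYYYMMDD suggestion, here is what the formatting looks like in Python (the asker's helper is VBA; date_for_sql is a hypothetical equivalent written only for this sketch):

```python
from datetime import date


def date_for_sql(d: date) -> str:
    """Format a date as a quoted YYYYMMDD literal, with no separators."""
    return "'" + d.strftime("%Y%m%d") + "'"


literal = date_for_sql(date(2016, 1, 14))
```

The same change in the VBA function would be a different format string in the Format call.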
OK, this seems like it should be insanely easy, but I cannot figure it out. Everywhere I look online says to create temp tables and VB scripts, and I cannot believe I have to do that. My goal is to insert all the records in a table with a date later than the max date in the destination table.
UPDATE The 2 tables are in two different non linked SQL databases
So:
Select @[User::Dated] = MAX(Dateof) from Table2
Insert into Table2
Select *
From Table1
Where DateOf > @[User::Dated]
I am trying to do this in SSIS. I declared a variable, and the Execute SQL step looks like it is assigning the single-row output to it. But when I go into the data flow, it gives me no parameters to choose from, and when I force the known parameter, which is in the project scope, it says no parameter exists.
Create two OLE DB data sources, each pointing at one of your two databases.
Create a variable called max_date and make its data type String.
Place an Execute SQL Task on the Control Flow, change its connection type to OLE DB and for the connection select the name of the data source that contains Table2. Set the ResultSet to Single Row. Add the following for the SQLStatement:
SELECT CAST(MAX(Dateof) AS VARCHAR) AS max_date FROM Table2
Go to the Result Set pane, click Add and enter the following:
Result Name: max_date
Variable Name: User::max_date
You can now use the max_date variable in an expression to create a SQL statement, for example you could use it in another Execute SQL Task which would use the second Data Connection like so:
"INSERT INTO Table2
SELECT *
FROM Table1
WHERE DateOf > '" + @[User::max_date] + "'"
Or in an OLE DB Source in a data flow like so:
"SELECT *
FROM Table1
WHERE DateOf > '" + @[User::max_date] + "'"
You can do this in a single SQL Task if you want:
Insert into Table2
Select *
From Table1
Where DateOf > (Select MAX(Dateof) from Table2)
If you want to use multiple Execute SQL Task items in the control flow, or want to make use of the parameter in a data flow instead, you have to change the General > Result Set option for your MAX() query to Single Row, then move from General to Result Set and Add a new variable for your result set to occupy.
To use that variable in your INSERT INTO.... query via Execute SQL Task, you'll construct your query with a ? for each parameter and map them in the parameter mapping section. If a variable is used multiple times in a query it's easiest to use a stored procedure, so you can simply pass the relevant parameters in SSIS.
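For what it's worth, the overall pattern (read the max date from the destination, then pull only newer rows from the source) can be sketched outside SSIS. The Python below uses two in-memory SQLite databases as stand-ins for the two unlinked databases; the Id column and the sample rows are invented, and dates are stored as yyyy-mm-dd text so that string comparison matches date order.

```python
import sqlite3

# Two separate connections stand in for the two unlinked databases.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")

source.execute("CREATE TABLE Table1 (Id INTEGER, DateOf TEXT)")
target.execute("CREATE TABLE Table2 (Id INTEGER, DateOf TEXT)")
source.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [(1, "2016-01-10"), (2, "2016-01-12"), (3, "2016-01-15")],
)
target.execute("INSERT INTO Table2 VALUES (1, '2016-01-10')")

# Step 1 (the Execute SQL Task): read the latest date already loaded.
max_date = target.execute("SELECT MAX(DateOf) FROM Table2").fetchone()[0]

# Step 2 (the data flow source): pull only newer rows, binding the value to
# a ? placeholder instead of concatenating it into the query text.
new_rows = source.execute(
    "SELECT Id, DateOf FROM Table1 WHERE DateOf > ? ORDER BY Id", (max_date,)
).fetchall()

# Step 3 (the data flow destination): append the new rows.
target.executemany("INSERT INTO Table2 VALUES (?, ?)", new_rows)
target.commit()

loaded = target.execute("SELECT Id FROM Table2 ORDER BY Id").fetchall()
```

Note that if Table2 is empty, MAX returns NULL and the comparison matches no rows, so a first load needs a fallback date.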