I have a very simple SQL query in my SSIS (VS 2017) Data Flow. It connects to Oracle via Native OLE DB\Oracle Provider for OLE DB and uses a SQL Command to query an Oracle view. The destination is a SQL Server 2017 table. If I query only the first 20 columns or so (out of 57), I get all 1,060,000-ish records. As I add more columns, the row count drops. I have already removed all date fields from both tables and have done quite a few data conversions (the source table has several varchar2(4000) fields that need to be SUBSTR'd to reasonable lengths for the SQL Server destination table). All fields in the destination table are nullable. When I pull the SQL out of SSIS and run it in SQL Developer, I get the right row count. When I run it in SSIS, it drops from 1.06M rows to around 28k.
I already tried the SQLChick hack (https://www.sqlchick.com/entries/2012/9/2/resolving-missing-records-in-ssis-from-oracle-source.html). It doesn't work and causes connection errors: I had to use VS Code to add that property to my Oracle connection, and when I went back to VS, the connection was broken. When I opened it back up to re-enter the connection credentials, the extra property was dropped. I have reduced and increased the Rows per Batch and Maximum insert commit size values to no avail. I have also set the RetainSameConnection property to True for all the Connection Managers. I'm at a loss! (As you can see from the pics, both jobs finish "successfully".)
This code returns all records:
SELECT
PIDM,
STUDENT_ID,
LAST_NAME,
FIRST_NAME,
MIDDLE_NAME,
LFM_NAME,
FML_NAME,
SORT_NAME,
GENDER,
ETHNIC_CODE,
ETHNIC_CODE_DESC,
LEGACY_CODE,
LEGACY_CODE_DESC,
ADDR_STR_LINE1,
ADDR_STR_LINE2,
ADDR_STR_LINE3,
ADDR_CITY,
ADDR_COUNTY,
ADDR_STATE,
ADDR_NATION,
ADDR_ZIPCODE,
ADDR_AREA_CODE,
ADDR_PHONE
FROM <TABLE_NAME>
This code returns only 28k:
SELECT
PIDM,
STUDENT_ID,
LAST_NAME,
FIRST_NAME,
MIDDLE_NAME,
LFM_NAME,
FML_NAME,
SORT_NAME,
GENDER,
ETHNIC_CODE,
ETHNIC_CODE_DESC,
LEGACY_CODE,
LEGACY_CODE_DESC,
ADDR_STR_LINE1,
ADDR_STR_LINE2,
ADDR_STR_LINE3,
ADDR_CITY,
ADDR_COUNTY,
ADDR_STATE,
ADDR_NATION,
ADDR_ZIPCODE,
ADDR_AREA_CODE,
ADDR_PHONE,
ORIGIN_STR_LINE1,
ORIGIN_STR_LINE2,
ORIGIN_STR_LINE3,
ORIGIN_CITY,
ORIGIN_COUNTY,
ORIGIN_NATION,
ORIGIN_STATE,
ORIGIN_ZIPCODE,
EMAIL,
HIGH_SCHOOL_CODE,
HIGH_SCHOOL_CODE_DESC,
HIGH_SCHOOL_CITY,
HIGH_SCHOOL_STATE,
HIGH_SCHOOL_GPA,
HIGH_SCHOOL_RANK,
PRIOR_COLLEGE_CODE,
PRIOR_COLLEGE_CODE_DESC,
PRIOR_COLLEGE_DEGREE_CODE,
PRIOR_COLLEGE_DEGREE_CODE_DESC,
PRIOR_COLLEGE_CITY,
PRIOR_COLLEGE_STATE,
ADMIT_FLAG,
GENERAL_STUDENT_FLAG,
CURRENT_ENROLLMENT_FLAG,
LETTER_CODES,
CONTACT_CODES,
COMMENT_CODES,
DIRECTORY_EMAIL,
ADDR_DIVISION_CODE,
HIGH_SCHOOL_CLASS_SIZE,
ETHNICITY,
RACE_CODE,
REGULATORY_RACE,
INT_LANG
FROM <TABLE_NAME>
Troubleshooting steps from the comments
If you run the all-column version of the query in SQL Developer (or whatever Oracle query tool you use) using the same credentials as the SSIS package, do you get 28k rows or 1M?
1M records are returned in SQL Developer when I use the same credentials SSIS is using.
As painful as it may be, I would add 1 column, run, and observe the results. The first time you see a drop in row count, interrogate the heck out of the source data (data type, collation, whether some permission thing is at play). If nothing seems out of place, edit the question to include the full table definition and identify the first source column that throws the results off.
I've done that, column by column. I even added a column that already worked (ADDR_STR_LINE1) under the alias ORIGIN_STR_LINE1, knowing that ADDR_STR_LINE1 had already worked and that both fields share the exact same datatypes, lengths, etc. I just ran it with this code: SELECT PIDM, ORIGIN_STR_LINE1, ORIGIN_STR_LINE2, ORIGIN_STR_LINE3, ORIGIN_CITY, ORIGIN_COUNTY, ORIGIN_NATION, ORIGIN_STATE, ORIGIN_ZIPCODE FROM ODSMGR.RECRUIT_PERSON_OSU and it returned 1M records.
While it's little consolation, you're hitting all the troubleshooting steps I'd employ. I suppose the next item I would try to rule out is some bizarre row width issue/bug. Add a new data flow. As your source query, take one of your varchar2(4000) fields and duplicate it 60 times, i.e. SELECT ADDR_STR_LINE1 AS Col0, ADDR_STR_LINE1 AS Col1, ..., ADDR_STR_LINE1 AS Col59 FROM Owner.Table, then connect that to a Derived Column task (it doesn't need to do anything, just serve as an anchor point) and run it. Do you get 1M or 28k?
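Typing 60 aliases by hand is error-prone, so the wide test query can be generated instead. This is only a sketch in Python (not part of SSIS); the function name build_wide_query is made up for illustration, and the column and table names are the placeholders used in the comment above.

```python
def build_wide_query(column: str, table: str, copies: int = 60) -> str:
    """Return a SELECT that repeats one column `copies` times as Col0..ColN."""
    select_list = ", ".join(f"{column} AS Col{i}" for i in range(copies))
    return f"SELECT {select_list} FROM {table}"


# Placeholder names taken from the comment; swap in the real owner and view.
query = build_wide_query("ADDR_STR_LINE1", "Owner.Table")
print(query)
```

The printed statement can be pasted into the SQL Command of the test data flow.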
Adding more of my troubleshooting steps. 1) Created a view off the original table, casting all of the fields that would need to be truncated as VARCHAR(proper length based on the destination table). 2) Added/subtracted fields piecemeal until I thought I had a stable query, knowing that if I added <this field>, <this many rows> would be dropped. But, for instance, when I added PRIOR_COLLEGE_CITY, the first time my counts dropped from 1063202 to 952755, and when I ran it again later, the counts dropped from 1063202 to 953989. So even if it were a data issue (it's not), it's not a consistent one.
Once I got my 953989 rows into the destination table, I compared which PRIOR_COLLEGE_CITY records were missing. In the Source Data Flow, I explicitly queried for those records, and they loaded fine, so again, not a data issue.
According to the picture you provided, some records are already lost when the Source component outputs its rows, so we can conclude that the problem occurs in the Source component.
In this case, please check the following:
1. Run the query (not the view, but the query inside the view) in your Oracle environment while the Source component is executing it, then check whether the number of records returned from the Oracle environment equals the number returned from the SSIS Source component. Do this on a separate data task.
2. Check whether the source table is changing.
3. If the results are correct when running the query in the Oracle environment, compare them with the results returned by the SSIS Source and analyze the missing data.
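The comparison in step 3 comes down to a set difference on the key column. Below is a minimal sketch in Python, assuming PIDM identifies a row and that both key lists have already been exported; the helper names and sample keys are hypothetical.

```python
def find_missing_keys(source_keys, destination_keys):
    """Return keys present in the source but absent from the destination."""
    return sorted(set(source_keys) - set(destination_keys))


def build_recheck_query(table, key_column, keys):
    """Build a query that re-reads only the rows that went missing."""
    key_list = ", ".join(str(key) for key in keys)
    return f"SELECT * FROM {table} WHERE {key_column} IN ({key_list})"


# Hypothetical sample keys; in practice these come from the two systems.
source_pidms = [101, 102, 103, 104, 105]
loaded_pidms = [101, 103, 105]

missing = find_missing_keys(source_pidms, loaded_pidms)
recheck = build_recheck_query("ODSMGR.RECRUIT_PERSON_OSU", "PIDM", missing)
print(missing)
print(recheck)
```

If the re-read rows load without trouble, as the asker found, the data itself is an unlikely cause. Note that Oracle limits an IN list to 1,000 literals, so a large set of missing keys would need batching.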
I had a similar problem, mostly with the ODBC driver for Oracle!
The problem was not only the number of rows returned; in my case, for some reason, it also grouped the values of the first column.
The only solution I've found is to use a driver other than ODBC or OLE DB.
Using the native Oracle Source and Oracle Destination in VS 2017 worked perfectly, and performance was also better than with ODBC and OLE DB.
I was having a similar issue: there were 1,470,491 rows in the Oracle view I was querying, and all of them came across when I ran the package in Visual Studio, but only 377,257 rows were read when I ran the package from SQL Agent. I tried the SQLChick "UseSessionFormat" hack that you mentioned. While editing the connection string used by the job (it comes in from configuration), I noticed that the connection string in the package had a "USERNAME" parameter as well as a "user id" parameter, but the configuration used by SQL Agent only had "USERNAME". I added the "user id" parameter to the configuration used by SQL Agent, and after that the job retrieved all 1,470,491 rows.
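A difference like this is easy to miss by eye, so it can help to diff the two connection strings key by key. This is a rough Python sketch; the parser is deliberately naive (it does not handle quoted values that contain semicolons), and the data source and user name are invented for the example.

```python
def parse_connection_string(conn_str: str) -> dict:
    """Split 'key=value;key=value' into a dict with lower-cased keys."""
    pairs = {}
    for part in conn_str.split(";"):
        if "=" not in part:
            continue  # skip empty or malformed segments
        key, value = part.split("=", 1)
        pairs[key.strip().lower()] = value.strip()
    return pairs


def keys_missing_from(reference: str, candidate: str) -> list:
    """List keys that appear in `reference` but not in `candidate`."""
    ref = parse_connection_string(reference)
    cand = parse_connection_string(candidate)
    return sorted(set(ref) - set(cand))


# Hypothetical strings: one from the package, one from the SQL Agent config.
package_conn = "Provider=OraOLEDB.Oracle;Data Source=ORCL;USERNAME=etl_user;User ID=etl_user"
agent_conn = "Provider=OraOLEDB.Oracle;Data Source=ORCL;USERNAME=etl_user"

missing_keys = keys_missing_from(package_conn, agent_conn)
print(missing_keys)
```

Any key reported here is a candidate for the kind of mismatch described above.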
I'm following a tutorial on Azure Data Factory migration from Azure SQL to Blob through pipelines. While most of the concepts make sense, the 'Copy Data' query is a bit confusing. I have a background in writing Oracle SQL, but Azure SQL on ADF is pretty different and I'm struggling to find specific technical documentation, probably because it's not widely adopted yet.
Pipeline configuration shown below:
Query is posted below:
SELECT data_source_table.PersonID,data_source_table.Name,data_source_table.Age,
CT.SYS_CHANGE_VERSION, SYS_CHANGE_OPERATION
FROM data_source_table
RIGHT OUTER JOIN CHANGETABLE(CHANGES data_source_table,
@{activity('LookupLastChangeTrackingVersionActivity').output.firstRow.SYS_CHANGE_VERSION})
AS CT ON data_source_table.PersonID = CT.PersonID
WHERE CT.SYS_CHANGE_VERSION <=
@{activity('LookupCurrentChangeTrackingVersionActivity').output.firstRow.CurrentChangeTrackingVersion}
Output to the sink Blob as a result of the 'Copy Data' query:
2,name2,14,4,U
7,name7,51,3,I
8,name8,31,5,I
9,name9,38,6,I
A couple of questions I had:
1. There's a lot of referencing of other activities in the 'Copy Data' query, like @{activity('...').output.firstRow.CurrentChangeTrackingVersion}. Is there a way to know the appropriate syntax for referencing other activities? I can't find any good documentation on the syntax, like what .firstRow is or what the CHANGETABLE output looks like. I can't replicate this query in SSMS, which makes it a bit of a black box for me.
2. SYS_CHANGE_OPERATION appears in the SELECT with no table name prefix. Is this directly querying from the table in SourceDataset? (It points to data_source_table, which has change tracking enabled.) My main confusion stems from how change tracking information is stored for the enabled tables. Is there a way to show all of a table's tracked changes in SSMS? I see some documentation on the return values, but it's hard for me to visualize without seeing it on the table, so an output query with some return values would be nice.
3. The LookupLastChangeTracking activity queries all rows from a table (which, when I checked, is just one row), but the LookupCurrentChangeTracking activity uses a CHANGE_TRACKING function to pull the version of the data sink in table_store_ChangeTracking_version. Why does it use a function when the data sink's version is already recorded in table_store_ChangeTracking_version?
Sorry for the many questions, but I can't find any way to make this learning curve a bit less steep. Any guides or resources would be awesome!
There is an article that walks through the same thing in the UI, and it will help you understand it better:
https://learn.microsoft.com/en-us/azure/data-factory/tutorial-incremental-copy-change-tracking-feature-portal
1. These are Lookup activities, which are very straightforward. Please read about them here:
https://learn.microsoft.com/en-us/azure/data-factory/control-flow-lookup-activity
2. SYS_CHANGE_OPERATION is a column returned by CHANGETABLE(CHANGES ...) (aliased as CT in the query) rather than a column stored on data_source_table; because the name is unambiguous, it can be selected without a prefix. Regarding the details of how the change tracking (CT) data is stored, I am not sure whether all the system tables are exposed on Azure SQL, but the on-prem version of SQL Server did have a few tables that could be queried if needed. For this exercise, though, I think that would be overkill.
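To make the shape of the result easier to picture, here is a small Python simulation of what the Copy Data query does. It is not ADF or SQL Server code: the dictionaries stand in for data_source_table and for the rows CHANGETABLE(CHANGES ...) would return, the PersonID 1 row and the version numbers passed in are hypothetical, and the sample rows are chosen to reproduce the output shown in the question.

```python
# Hypothetical stand-in for data_source_table: PersonID -> (Name, Age)
table_rows = {
    1: ("name1", 10),
    2: ("name2", 14),
    7: ("name7", 51),
    8: ("name8", 31),
    9: ("name9", 38),
}

# Stand-in for the change log: (PersonID, SYS_CHANGE_VERSION, SYS_CHANGE_OPERATION)
change_log = [
    (1, 1, "I"),
    (7, 3, "I"),
    (2, 4, "U"),
    (8, 5, "I"),
    (9, 6, "I"),
]


def changed_rows(last_version, current_version):
    """Mimic the RIGHT OUTER JOIN against changes made after last_version."""
    result = []
    for person_id, version, operation in sorted(change_log):
        # CHANGETABLE(CHANGES t, v) covers changes with a version greater than v;
        # the WHERE clause caps the range at the current version.
        if not (last_version < version <= current_version):
            continue
        row = table_rows.get(person_id)
        if row is None:
            # Deleted row: the table side of the outer join comes back empty.
            result.append((None, None, None, version, operation))
        else:
            result.append((person_id, row[0], row[1], version, operation))
    return result


for row in changed_rows(last_version=2, current_version=6):
    print(",".join(str(value) for value in row))
```

In the real pipeline the two version numbers are filled in by the @{activity(...)} expressions, which ADF resolves to plain values before the query is sent to the database.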
I want to display all audit history data in the MS CRM format.
I have imported all records from the AuditBase table in CRM into a table on another database server.
I want to query these records with SQL and present them in the Dynamics CRM format (as per the above image).
What I have done so far:
select
AB.CreatedOn as [Created On], SUB.FullName as [Changed By],
SM.Value as [Event], AB.AttributeMask as [Changed Field],
AB.ChangeData as [Old Value], '' as [New Value]
from AuditBase AB
inner join StringMap SM on SM.AttributeValue = AB.Action and SM.AttributeName = 'action'
inner join SystemUserBase SUB on SUB.SystemUserId = AB.UserId
--inner join MetadataSchema.Attribute ar on ab.AttributeMask = ar.ColumnNumber
--INNER JOIN MetadataSchema.Entity en ON ar.EntityId = en.EntityId and en.ObjectTypeCode=AB.ObjectTypeCode
--inner join Contact C on C.ContactId=AB.ObjectId
where AB.ObjectId = '00000000-0000-0000-0000-000000000000'
order by AB.CreatedOn desc
My problem is that AttributeMask is a comma-separated value that I need to compare with the ColumnNumber field of the MetadataSchema.Attribute table. Also, how do I get the new value from that entity?
I have already checked this link: Sql query to get data from audit history for opportunity entity, but it's not giving me the [New Value].
NOTE : I can not use "RetrieveRecordChangeHistoryResponse", because i need to show these data in external webpage from sql table(Not CRM database).
Well, basically Dynamics CRM does not create this Audit View (the way you see it in CRM) using SQL query, so if you succeed in doing it, Microsoft will probably buy it from you as it would be much faster than the way it's currently handled :)
But really - the way it works currently, SQL is used only for obtaining all relevant Audit view records (without any matching with attributes metadata or whatever) and then, all the parsing and matching with metadata is done in .NET application. The logic is quite complex and there are so many different cases to handle, that I believe that recreating this in SQL would require not just some simple "select" query, but in fact some really complex procedure (and still that might be not enough, because not everything in CRM is kept in database, some things are simply compiled into the libraries of application) and weeks or maybe even months for one person to accomplish (of course that's my opinion, maybe some T-SQL guru will prove me wrong).
So, I would do it differently - use RetrieveRecordChangeHistoryRequest (which was already mentioned in some answers) to get all the Audit Details (already parsed and ready to use) using some kind of .NET application (probably running periodically, or maybe triggered by a plugin in CRM etc.) and put them in some Database in user-friendly format. You can then consume this database with whatever external application you want.
Also I don't understand your comment:
I can not use "RetrieveRecordChangeHistoryResponse", because i need to
show these data in external webpage from sql table(Not CRM database)
What kind of application cannot call an external service (you can create a custom service; you don't have to use the CRM service) to get some data, but can access an external database? You should not read from the DB directly. A better approach would be to prepare a web service that returns the audit data you want (using the CRM SDK under the hood) and to call this service from the external application. Unless, of course, your external app is only capable of reading databases, not calling any custom web services...
It is not possible to reconstruct a complete audit history from the AuditBase tables alone. For the current values you still need the tables that are being audited.
The queries you would need to construct are complex, and writing them can be avoided if RetrieveRecordChangeHistoryRequest is a suitable option.
(See also How to get audit record details using FetchXML on SO.)
NOTE
This answer was submitted before the original question was extended stating that the RetrieveRecordChangeHistoryRequest cannot be used.
As I said in the comments, the Audit table will have the old value and new value, but not the current value. The current value will be pushed as the new value when the next update happens.
In your OP query, AB.AttributeMask will return comma (",") separated values and AB.changeData will return tilde ("~") separated values.
I assume you are fine with "~"-separated values in the Old Value column and want to show the current values of the fields in the New Value column. This is not going to work when multiple fields are enabled for audit. You have to split the AttributeMask value into CRM fields from AttributeView using ColumnNumber to get the required result.
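As a rough illustration of that split, the Python sketch below pairs each column number in the mask with the matching "~"-separated old value. It assumes the i-th value belongs to the i-th column number, and the column-number-to-name mapping is hypothetical (in CRM it would come from the attribute metadata for the audited entity).

```python
def parse_audit_record(attribute_mask: str, change_data: str, column_names: dict) -> list:
    """Pair each audited column with its old value.

    Assumes positional correspondence between the mask and the change data.
    """
    # The mask may carry leading/trailing commas, so drop empty tokens.
    column_numbers = [int(tok) for tok in attribute_mask.split(",") if tok.strip()]
    # Keep empty strings here: an empty old value is still a value.
    old_values = change_data.split("~")
    pairs = []
    for number, old_value in zip(column_numbers, old_values):
        name = column_names.get(number, f"<unknown column {number}>")
        pairs.append((name, old_value))
    return pairs


# Hypothetical mapping of ColumnNumber -> attribute name for one entity.
column_names = {12: "firstname", 15: "lastname", 31: "emailaddress1"}

record = parse_audit_record(",12,15,31,", "Jon~Smith~", column_names)
print(record)
```

Each pair can then be emitted as its own row, which is closer to how the CRM audit view lists one changed field per line.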
I would recommend the below reference blog to start with, once you get the expected result, you can pull the current field value using extra query either in SQL or using C# in front end. But you should concatenate again with "~" for values to maintain the format.
https://marcuscrast.wordpress.com/2012/01/14/dynamics-crm-2011-audit-report-in-ssrs/
Update:
From the above blog, you can tweak the SP query with your fields, then convert the last select statement to 'select into' to create a new table for your storage.
Modify the Stored procedure to fetch the delta based on last run. Configure the sql job & schedule to run every day or so, to populate the table.
Then select and display the data the way you want. I did the same in Power BI in under 3 days.
Pros/Cons: Obviously this requirement is for reporting purposes. Typically, reporting requirements are met by mirroring the database through replication or other means, so that Prod users and the Async server are not interrupted by injected plugins or on-demand ad hoc service calls. Moreover, you have access to the database and are not on CRM Online. Better not to reinvent the wheel; build on the available solution. This is my humble opinion, based on a Microsoft internal project implementation.
Hey StackOverflow community,
My question is as follows:
I have a table, say USER_ADDR, with a bunch of columns in one database, say DB001.
I need to copy the contents of this table (based on a criterion) to a similar table USER_ADDR (same name, yes) in another database, DB002, which has a different user ID and password.
I need to do this in a stored procedure that will be executed from .NET.
I tried this:
INSERT INTO "DB002".USER_ADDR (--column names--)
SELECT *
FROM "DB001".USER_ADDR
WHERE ID = "APPLICATION_NO_IN";
I get:
0: Error occurred: [IBM][DB2/NT64] SQL0204N "DB002.USER_ADDR" is an undefined name. LINE NUMBER=15. SQLSTATE=42704 : -204: IBM.Data.DB2: 42704
What am I doing wrong?
Thanks in advance
Vashist
I'm deleting my other answer after seeing the additional info about your use case. LOAD is mainly for bulk loads of large numbers of records.
In this case I'd recommend something like this: open connection1 in .NET to your data source, select the data, and hold it in a .NET DataTable. If required, you can do that select in a stored proc that returns either individual column values for a single row or a cursor (rowset) that contains all the columns (and rows). Then, in .NET, open connection2 and insert the data from the DataTable into your destination. Again, that can be done with a stored proc.
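The two-connection pattern looks roughly like the sketch below. It is written in Python with sqlite3 so that it runs on its own; the two in-memory databases merely stand in for DB001 and DB002, the STREET and CITY columns are invented, and a real implementation would use the .NET DB2 provider with two separately authenticated connections.

```python
import sqlite3


def copy_rows(source_conn, dest_conn, application_no):
    """Read matching rows over one connection and insert them over the other."""
    rows = source_conn.execute(
        "SELECT ID, STREET, CITY FROM USER_ADDR WHERE ID = ?", (application_no,)
    ).fetchall()  # plays the role of the in-memory DataTable
    dest_conn.executemany(
        "INSERT INTO USER_ADDR (ID, STREET, CITY) VALUES (?, ?, ?)", rows
    )
    dest_conn.commit()
    return len(rows)


# Two independent in-memory databases stand in for DB001 and DB002.
db001 = sqlite3.connect(":memory:")
db002 = sqlite3.connect(":memory:")
for conn in (db001, db002):
    conn.execute("CREATE TABLE USER_ADDR (ID TEXT, STREET TEXT, CITY TEXT)")
db001.executemany(
    "INSERT INTO USER_ADDR VALUES (?, ?, ?)",
    [("A100", "1 Main St", "Springfield"), ("A200", "2 Oak Ave", "Shelbyville")],
)
db001.commit()

copied = copy_rows(db001, db002, "A100")
destination_rows = db002.execute("SELECT * FROM USER_ADDR").fetchall()
print(copied, destination_rows)
```

The key point is that neither database ever references the other; the application holds both connections and moves the rows itself.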
Another approach is to use an external script that connects to both databases.
It is not possible from just one database unless you use, as already mentioned, Information Integration (federation), or export the data and then load it.
I'm trying to export some tables from SQL Server 2005 and then create those tables and populate them in Oracle.
I have about 10 tables, varying from 4 columns up to 25. I'm not using any constraints/keys, so this should be reasonably straightforward.
Firstly, I generated scripts to get the table structure, then modified them to conform to Oracle syntax standards (i.e. changed nvarchar to varchar2).
Next, I exported the data using SQL Server's export wizard, which created a CSV flat file. However, my main issue is that I can't find a way to force SQL Server to double-quote column names. One of my columns contains commas, so unless I can find a way for SQL Server to quote column names, I will have trouble when it comes to importing this.
Also, am I going the difficult route, or is there an easier way to do this?
Thanks
EDIT: By quoting I'm referring to quoting the column values in the CSV. For example, I have a column which contains addresses like
101 High Street, Sometown, Some county, PO5TC053
Without changing it to the following, it would cause issues when loading the CSV:
"101 High Street, Sometown, Some county, PO5TC053"
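To show what a fully quoted export looks like and why it survives embedded commas, here is a short Python sketch using the standard csv module. It is only an illustration of the file format, not a replacement for the SQL Server export wizard; the id column is made up.

```python
import csv
import io

rows = [
    ["id", "address"],
    [1, "101 High Street, Sometown, Some county, PO5TC053"],
]

# QUOTE_ALL wraps every field in double quotes, so commas inside a value
# are not mistaken for field separators.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Reading it back yields two fields per row, with the address intact.
parsed = list(csv.reader(io.StringIO(csv_text)))
print(parsed[1])
```

Whatever tool loads the file into Oracle needs to be told that fields are optionally enclosed by double quotes for the same reason.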
After looking at some options with SQLDeveloper, and at exporting/importing manually, I found a utility in SQL Server Management Studio that gets the desired results and is easy to use. Do the following:
1. Go to the source schema on SQL Server
2. Right click > Export data
3. Select the current schema as the source
4. Select "Oracle OLE provider" as the destination
5. Select properties, then add the service name into the first box, then the username and password; be sure to click "remember password"
6. Enter the query that returns the data to be migrated
7. Enter the table name, then click the "Edit" button
8. Alter the mappings: change nvarchar to varchar2, and INTEGER to NUMBER
9. Run
10. Repeat the process for the remaining tables; save as jobs if you need to do this again in the future
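If the table scripts are edited by hand instead, the two type changes mentioned in the mapping step can be applied with a small script. This Python sketch covers only those two mappings (sized nvarchar to VARCHAR2, and INT/INTEGER to NUMBER); the sample DDL is invented, and other types, nvarchar(max), and byte-versus-character length semantics would need separate handling.

```python
import re

# Only the two mappings mentioned in the steps above are handled here.
TYPE_MAP = [
    (re.compile(r"\bnvarchar\s*\((\d+)\)", re.IGNORECASE), r"VARCHAR2(\1)"),
    (re.compile(r"\b(?:integer|int)\b", re.IGNORECASE), "NUMBER"),
]


def to_oracle_ddl(ddl: str) -> str:
    """Rewrite SQL Server column types in a DDL string to Oracle equivalents."""
    for pattern, replacement in TYPE_MAP:
        ddl = pattern.sub(replacement, ddl)
    return ddl


# Hypothetical table definition for illustration.
sample = "CREATE TABLE customers (id INT, address NVARCHAR(200))"
converted = to_oracle_ddl(sample)
print(converted)
```

The word-boundary anchors keep types such as bigint, or column names such as int_lang, from being rewritten by accident.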
Use the SQLDeveloper migration tools
I think quoting column names in Oracle is something you should not do. It causes all sorts of problems.
As Robert has said, I'd strongly advise against quoting column names. The result is that you'd have to quote them not only when importing the data, but also whenever you want to reference that column in a SQL statement, and yes, that probably means in your program code as well. Building SQL statements becomes a total hassle!
From what you're writing, I'm not sure if you are referring to the column names or the data in these columns. (Can SQL Server really have a comma in the column name? I'd be really surprised if there was a good reason for that!) Quoting the column content should be done for any string-like columns (although I found that other characters usually work better, as the need to "escape" quotes becomes another issue). If you're exporting to CSV, that should be an option, but then I'm not familiar with the export wizard.
Another idea for moving the data (depending on the scale of your project) would be to use an ETL/EAI tool. I've been playing around a bit with the Pentaho suite and their Kettle component. It offered a good range of options to move data from one place to another. It may be a bit oversized for a simple transfer, but if it's a big "migration" with the corresponding volume, it may be a good option.