How to Detect Errors in Apex Data Loader Batch Execution - batch-file

We have a DOS batch job which runs a multi-step process to:
1. Delete all records from Salesforce for a specific object (download the IDs and then delete them using Data Loader).
2. Delete all records from a database table which mirrors the Salesforce data.
3. Extract data from a database and upload the data to the Salesforce objects using Data Loader.
4. Download the Salesforce data into the database table.
Recently, the first step has been failing with a QUERY-TIMEOUT error. If I rerun the process, it generally works OK without any other changes. This is being investigated, but it is not my question.
My question is: How can I detect when step 1 (which uses Data Loader) in the batch file fails? If this fails, I do not want to proceed with the rest of the process, as this deletes the database data which is used elsewhere for reporting.
Does the Apex Loader set an ERRORLEVEL if it fails? How else can I determine that there was a failure?
Thanks.
Ron Ventura

For more detail, please refer to the link below. Basically, the approach is to check the error log file that Data Loader generates for each run: if the run is 100% successful, the error log will contain only a header line and no data rows, so any data rows in that file indicate a failure.
https://www.nimbleuser.com/blog/failing-safe-with-the-apex-data-loader-for-salesforce-crm
You can also refer to this answer:
https://salesforce.stackexchange.com/questions/14466/availability-of-apex-data-loader-error-file-from-local-pc-to-salesforce
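If it helps, here is a minimal sketch of that check as a small Python script; the error log path and the exit codes are assumptions for illustration, not something Data Loader mandates. The batch file can run it right after the Data Loader step and test ERRORLEVEL before continuing (e.g. IF ERRORLEVEL 1 GOTO :abort), rather than depending on Data Loader's own exit code.

    import csv
    import sys

    # Path to the error log produced by the Data Loader delete step (assumed name/location).
    ERROR_LOG = r"C:\dataloader\logs\error_log.csv"

    def run_failed(path):
        """Return True if the error log has any data rows beyond the header."""
        try:
            with open(path, newline="") as f:
                rows = list(csv.reader(f))
        except FileNotFoundError:
            # No error log at all usually means the run never completed.
            return True
        # A fully successful run leaves only the header line in the error log.
        return len(rows) > 1

    if __name__ == "__main__":
        sys.exit(1 if run_failed(ERROR_LOG) else 0)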
Regards.

Related

Rest API to Get data from ADF to Snowflake

I'm trying to monitor pipeline runs from ADF in a Snowflake table. I've managed to use a REST API to get the data into Power BI, but I now need to get the data from ADF into Snowflake. If anyone has any examples, that would be of great help. The data I need to capture includes the pipeline name, run time, start time, error message, etc.
Please check the approach below.
Capture details such as the pipeline name, run time, start time, and error message in pipeline variables.
In the Copy activity's Source tab, point to a dummy file on Blob Storage or Data Lake, then add additional columns for the pipeline name, run time, start time, error message, etc.
In the Copy activity's Sink tab, point to your Snowflake table.
In the Copy activity's Mapping tab, map your source and sink columns accordingly.
Please check the video below, where the author does the same thing but with a data flow; in your case you can go with the Copy activity as explained above.
https://www.youtube.com/watch?v=-xna7n33lmc
To see how to access the error message of any activity failure, please check this video: https://www.youtube.com/watch?v=_lSB7jaDnG0
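If it helps to picture the sink side, here is a rough sketch of the Snowflake table the Copy activity's sink would point to, created and test-populated with the snowflake-connector-python package. The table and column names (PIPELINE_RUN_LOG, PIPELINE_NAME, etc.) and the connection values are assumptions, not anything ADF or Snowflake requires.

    import snowflake.connector  # pip install snowflake-connector-python

    # Connection details are placeholders; fill in your own account values.
    conn = snowflake.connector.connect(
        account="my_account", user="my_user", password="my_password",
        warehouse="my_wh", database="my_db", schema="my_schema",
    )
    cur = conn.cursor()

    # Target table that the Copy activity sink (or this script) writes run metadata into.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS PIPELINE_RUN_LOG (
            PIPELINE_NAME  STRING,
            RUN_START      TIMESTAMP_NTZ,
            RUN_DURATION_S NUMBER,
            ERROR_MESSAGE  STRING
        )
    """)

    # Example row, e.g. values captured in ADF pipeline variables.
    cur.execute(
        "INSERT INTO PIPELINE_RUN_LOG VALUES (%s, %s, %s, %s)",
        ("CopyCustomerData", "2021-08-06 13:59:47", 42, None),
    )
    conn.close()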

Snowpipe Issue - Azure data lake storage

We're running into an issue where Snowpipe is probably starting to ingest the file even before it gets fully written in Azure Data Lake Storage.
It then throws an error: Error parsing the parquet file: Invalid: Parquet file size is 0 bytes.
Here are some stats showing that the file was fully written at 13:59:56 while Snowflake was notified at 13:59:47.
PIPE_RECEIVED_TIME - 2021-08-06 13:59:47.613 -0700
LAST_LOAD_TIME - 2021-08-06 14:00:05.859 -0700
ADLS file last modified time - 13:59:56
Has anyone run into this issue or have any pointers for troubleshooting this?
I have seen something similar once. I was trying to funnel Azure logs into a storage account and have them picked up. However, the built-in process that wrote the logs would create a file, gradually append updates to it with new logs, and then every hour or so cut over to a new file for more logs.
The Snowpipe would pick up the file with one log entry (or none), and from there the Azure queue would never send another event for that file, so Snowflake would never query it again to process it.
So I'm wondering if your process is creating the file and then updating it, rather than creating it with the output already fully written.
If this is the issue and you don't have control over how the file is created, you could try using a task that runs COPY INTO on a schedule (rather than a Snowpipe), so that you can restrict the list of files being copied to just the files that have finished writing, as sketched below.
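As a rough sketch of that alternative, a scheduled task could be created through the Python connector as below. The stage, table, warehouse and path pattern are assumptions about your setup, and the 'finished/' prefix simply stands in for however your producer marks files that are fully written (for example by renaming or moving them when done).

    import snowflake.connector  # pip install snowflake-connector-python

    conn = snowflake.connector.connect(
        account="my_account", user="my_user", password="my_password",
        warehouse="my_wh", database="my_db", schema="my_schema",
    )
    cur = conn.cursor()

    # A task that runs COPY INTO on a schedule instead of relying on Snowpipe event
    # notifications. Only files under a 'finished/' prefix are picked up, i.e. files
    # the producer has fully written and then moved/renamed into place.
    cur.execute("""
        CREATE OR REPLACE TASK load_parquet_task
          WAREHOUSE = my_wh
          SCHEDULE = '5 MINUTE'
        AS
          COPY INTO my_table
          FROM @my_adls_stage
          FILE_FORMAT = (TYPE = PARQUET)
          MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
          PATTERN = '.*finished/.*[.]parquet'
    """)
    cur.execute("ALTER TASK load_parquet_task RESUME")  # tasks are created suspended
    conn.close()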

Test Automation - Design for Web/SOA architecture

I am finding a lot about test automation and web architecture using Selenium/Java; however, I'd like to ask about another scenario.
Say you have a text file that contains customer details. A process then needs to be manually triggered that will parse that file and load the details into a database. The details are then viewable from a web page. From the web page you can further add/delete/edit/navigate records.
As a design I was thinking that I would follow this logic:
Set-up file and automatically trigger process
Automatically parse file and compare with database entries to ensure they were entered correctly.
Automatically trigger selenium test to log-in and view results in web page, hence I would have compared the file to the database and the web page to the database.
I am not sure about this approach though, and it presents various challenges, especially in terms of re-initializing state between each test. Do you think there is a better approach? Ultimately I need to make sure that the details in the file end up in the right database tables/columns and that the details can be correctly seen on the web page.
Many thanks!
I think your workflow is adequate, with a small exception.
For state, think about the following high-level concept of "phases".
Setup Phase:
Have your automation routine create the records that you will need during the automated routines that follow.
As the routines create the records in the DB, it would probably be good to have a column with a GUID (36 characters) that you can generate. In other words, do not assume that you will create unique row id 1, row id 2, row id 3, etc.
Since you will know this value at create time, write a manifest file to keep track of the DB records you will query during your test run.
Tests Run Phase:
Run your automated tests, having them use the manifest file to get the IDs they need. You already mentioned this in "say you have a text file that contains customer details".
With the IDs in the manifest, do whatever processing you need to do for the test(s).
Teardown:
Using the manifest file, find your DB records by their identifiers (GUIDs) and run the SQL statements to delete those records.
Truncate the manifest file so it is empty again (or write to it in a non-append way for all writes, which accomplishes the same goal).
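A minimal sketch of the setup/teardown phases in Python; the customers table, the manifest path, and SQLite as the stand-in database are all assumptions for illustration:

    import json
    import sqlite3
    import uuid

    MANIFEST = "manifest.json"  # tracks the records created for this test run

    def setup(conn, count=3):
        """Create test records keyed by generated GUIDs and record them in a manifest."""
        ids = [str(uuid.uuid4()) for _ in range(count)]
        with conn:
            conn.executemany(
                "INSERT INTO customers (guid, name) VALUES (?, ?)",
                [(guid, f"test-customer-{i}") for i, guid in enumerate(ids)],
            )
        with open(MANIFEST, "w") as f:
            json.dump(ids, f)
        return ids

    def teardown(conn):
        """Delete exactly the records listed in the manifest, then empty the manifest."""
        with open(MANIFEST) as f:
            ids = json.load(f)
        with conn:
            conn.executemany("DELETE FROM customers WHERE guid = ?", [(g,) for g in ids])
        open(MANIFEST, "w").close()  # truncate the manifest for the next run

    if __name__ == "__main__":
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE customers (guid TEXT PRIMARY KEY, name TEXT)")
        setup(conn)
        teardown(conn)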

Creating A Log Of Files In A Folder and update into table

Can anyone help me build a table that lists all the files in a specified folder, so that whenever a file is copied to that folder the table is updated with a log entry for that file?
I need the list to retain the names, even if the file is moved from that folder or deleted. Later the data would be deleted by a scheduler.
Also I need the table to record the time exactly when the file was copied into that folder and not the modification or creation time.
I am using Windows 7; how can I build a system with my desired behaviour?
Just turn on Windows file auditing for that folder; the YouTube video takes you through the process.
Microsoft provides information on its TechNet site on how you can use the LogParser tool to extract Security events from the event log.
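As a rough sketch of the LogParser route, something like the following could dump the file-access audit events to a CSV that you then load into your table; the LogParser path, the output file name, and event ID 4663 ("an attempt was made to access an object") are assumptions that rely on file auditing being enabled with a SACL on the folder, so adjust the query to your environment.

    import subprocess

    # Path to LogParser 2.2 is an assumption; adjust to where it is installed.
    LOGPARSER = r"C:\Program Files (x86)\Log Parser 2.2\LogParser.exe"

    # The Strings field carries the object (file) path; you would still filter the
    # rows down to your folder of interest before loading them into the table.
    query = (
        "SELECT TimeGenerated, EventID, Strings "
        "INTO file_audit_log.csv "
        "FROM Security WHERE EventID = 4663"
    )

    subprocess.run([LOGPARSER, query, "-i:EVT", "-o:CSV", "-q:ON"], check=True)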
Note: Admin questions should really be posted to the SuperUser site.

how to index data in solr from database automatically

I have a MySQL database for my application. I implemented Solr search and used the DataImportHandler (DIH) to index data from the database into Solr. My question is: is there any way that when the database gets updated, my Solr index is automatically updated with the new data added to the database? In other words, I don't want to have to run the index process manually every time the database tables change. If yes, please tell me how I can achieve this.
I don't think Solr itself has a feature that indexes the data whenever an update happens in the DB.
But there are possibilities, for example with the help of triggers: it is possible to run an external application from a trigger.
Write a cron job that triggers a PHP script which reads from the DB and indexes the data in Solr, and create a trigger in the DB (which calls this script) for the CRUD operations, so that whenever something changes in the DB, the trigger calls the script and the indexing happens.
Please see:
Invoking a PHP script from a MySQL trigger
Automatic Scheduling:
Please see this post, How can I Schedule data imports in Solr, for more information on scheduling. The second answer explains how to run the import using cron.
Since you used a DataImportHandler to initially load your data into Solr, you could configure a delta import that is executed using curl from a cron job to periodically add database changes to the index. Also, if you need more real-time updates, as @Rakesh suggested, you could use a trigger in your database and have it kick off the curl call to the delta DIH.
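As a small sketch of the scheduled delta import (the core name collection1 and the URL are assumptions about your setup), a script like this could be run from cron or Task Scheduler to hit the DIH endpoint:

    import requests  # pip install requests

    # DataImportHandler endpoint for the core; the core name is an assumption.
    DIH_URL = "http://localhost:8983/solr/collection1/dataimport"

    # Ask DIH for a delta import: only rows changed since the last run are indexed.
    resp = requests.get(DIH_URL, params={"command": "delta-import", "commit": "true", "wt": "json"})
    resp.raise_for_status()

    # Optionally check the handler status afterwards.
    status = requests.get(DIH_URL, params={"command": "status", "wt": "json"}).json()
    print(status.get("status"), status.get("statusMessages", {}))

A database trigger could kick off the same script if you need the index updated closer to real time.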
You can also run the import using a browser and Task Scheduler. Do the following steps on a Windows server:
Go to Administrative Tools => Task Scheduler.
Click "Create Task".
A Create Task screen opens with the tabs General, Triggers, Actions, Conditions, and Settings.
In the General tab, enter the task name "Solrdataimport" and in the description enter "Import mysql data".
Go to the Triggers tab, click New, and under Settings check Daily. In Advanced settings, set "Repeat task every ..." to whatever interval you want and click OK.
Go to the Actions tab and click New. Under Settings, set Program/Script to "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" (the installation path of the Chrome browser). In Add arguments enter http://localhost:8983/solr/#/collection1/dataimport//dataimport?command=full-import&clean=true and click OK.
With all of the above in place, the data import will run automatically. To stop the import process, follow the same steps but put "taskkill" in place of the Chrome path under the Actions tab, and in the arguments enter "/f /im chrome.exe".
Set the trigger timing according to your requirements.
What you're looking for is a "delta-import", and a lot of the other answers have that covered. I created a Windows WPF application and service to issue commands to Solr on a recurring schedule, since using cron jobs and Task Scheduler is a bit difficult to maintain if you have a lot of cores/environments.
https://github.com/systemidx/SolrScheduler
You basically just drop a JSON file into a specified folder and it will use a REST client to issue the commands to Solr.
