Pentaho "kettle.properties" file - which one am I using? - database

When one runs a job on the server (by selecting the server at run time), does PDI pick up the "kettle.properties" file from the server or from the local computer the job is launched from? What about the Pentaho User Console portal - where is the file picked up from when one runs jobs from there? Is there any way to tell PDI which "kettle.properties" file to use?

AFAIK, there is no way to pick a kettle.properties file location from within the Spoon interface right before executing a job/transformation.
The kettle.properties file used is always linked to the instance of Kettle that executes the job/transformation.
When running a job locally with the PDI Client (Spoon), the kettle.properties file used is the one contained in the directory pointed to by the -DKETTLE_HOME JVM option (defined when running the spoon.sh or Spoon.bat launch scripts).
When running a job/transformation on the Pentaho Server (by either scheduling it explicitly on the Server from Spoon, or by running it from the PUC), the kettle.properties file used is the one located in the directory pointed to by the -DKETTLE_HOME JVM option defined when running the start-pentaho.sh or the start-pentaho.bat launch scripts.
Both the PDI Client and the Pentaho Server default the Kettle home directory to the user's home directory, so kettle.properties is read from ~/.kettle/kettle.properties by default.
If you want to use a kettle.properties file located somewhere else, you will have to define the location of the Kettle Home directory yourself before starting the PDI Client or the Pentaho Server:
By setting an environment variable called KETTLE_HOME; it has to be set before running the Spoon or Pentaho Server launch scripts.
For the Pentaho Server, you can also add the option -DKETTLE_HOME to CATALINA_OPTS (if the Pentaho Server uses Tomcat) by editing the launch script.
You can find this information on the Customize the Pentaho Server page.
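As a minimal sketch (the D:\pentaho\config path is only an example), on Windows you could point the PDI Client at a custom Kettle home before launching it:
rem kettle.properties will then be looked up under %KETTLE_HOME%\.kettle\
set KETTLE_HOME=D:\pentaho\config
Spoon.bat
For a Tomcat-based Pentaho Server, the equivalent is to add the JVM option to CATALINA_OPTS before running the launch script:
set CATALINA_OPTS=%CATALINA_OPTS% -DKETTLE_HOME=D:\pentaho\config
start-pentaho.bat
In both cases Kettle would then read D:\pentaho\config\.kettle\kettle.properties.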

Related

SSIS Project - Catalog Deployment - Environment Variable (for location) accessing file server mapped to a local Z drive - SQL Server Agent issue

I have an SSIS package that reads a number of files using a For Each Loop Container. There are a number
of parameters in this package, and in the Integration Services Catalog in the SSMS, I have created an
environment with many variables for this project/package.
One of these environment variables holds the Source Location.
In my DEV setting, I was able to pass the Source Location environment variable as:
C:\Data Repository\Files (on my local machine).
Everything is fine: the package runs perfectly, and the For Each Loop Container reads the files.
However, in the PROD setting, I have to use a file server, mapped to a Z drive.
For example:
This PC > Data Repository (\\tordfs) (Z:) > Data Repository > X
becomes
Z:\Data Repository\X
when I copy the path.
Inside the SSIS package, I am able to set the parameter value for Source Location as Z:\Data Repository\X
and the For Each Loop Container works fine from the SSDT/Visual Studio.
Now, after the SSIS package/project is deployed to the catalog, when I feed Z:\Data Repository\X as the value of the Source Location environment variable and execute the package manually from the catalog, it works fine.
However, when I use the SQL Server Agent for the above process, I get the following error:
For Each Loop Container:Warning: The For Each File enumerator is empty.
The For Each File enumerator did not find any files that
matched the file pattern, or the specified directory was
empty.
Is there anything I need to do in the For Each Loop Container or the SSIS Catalog to eliminate the above error during execution from the Catalog using SQL Server Agent?
Let me know.
In Windows, mapped drives are user-specific, so you would have to map the drive for the account running the package. Instead, use a UNC path in both cases, not a drive letter.
So something like:
\\tordfs\Data Repository\Files
The account running the package will still need permissions to the share, and permissions to the folder, but won't need a drive letter mount.
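If you are not sure which UNC path is behind a drive letter, a quick check from a command prompt on the machine where the drive is mapped is:
rem shows the "Remote name" (the UNC path) that Z: is mapped to
net use Z:
That remote name is what goes into the package parameter, e.g. \\tordfs\Data Repository\Files as above.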
I have 2 suggestions:
Try giving read/write permissions to the SQL Server Database Engine service account NT SERVICE\MSSQL$<Instance Name> (where <Instance Name> should be replaced by the installed instance name):
Configure File System Permissions for Database Engine Access
Try to Map the Z:\ network drive within SQL Server:
Make Network Path Visible For SQL Server Backup and Restore in SSMS
Thanks a lot guys. Appreciate it.
I think I have fixed the issue:
In the environment variable, you cannot have Z:\Data Repository\X.
The variable must have a value like this:
\\tordfs\Data Repository\Data Repository\X
While manual execution from the SSMS Integration Services Catalog can accept Z:\Data Repository\X as the value of the environment variable,
the SQL Server Agent needs \\tordfs\Data Repository\Data Repository\X.
If the SQL Server Agent reads Z:\Data Repository\X from the environment in the catalog,
I get the For Each Loop Container warning posted above!
This said, I am using a proxy for the SQL Server Agent to resolve other access issues such as moving a file into a folder using the File System Task.

SSIS File System task didn't copy files from the source server location when scheduled

I'm new to SSIS and stuck on a problem, and I hope some of you have already run into something like this.
Task:
Copying files from a remote server to a local machine folder using File System task and For each loop container.
Problem:
The job executes (i.e. files are copied successfully) when I run it from the SSIS designer, but after the project is deployed to the SQL Server instance it isn't copying any files; in fact, the target folder is completely empty.
I don't understand this strange behavior. Any inputs would be of great help!
Regards-
Santosh G.
The For each loop will not error out if it doesn't find any files.
The SQL Agent account may not have access to read the directory contents.
Check whether your path is a variable: is it being set by a configuration or a /SET statement?
Can you log the path before starting the For Each loop?
Can you copy a dummy file into the directory and check whether SSIS can see it?
How are you running the job? cmd_exec() can give spurious results with file I/O tasks.
The issue was related to the user authorization of the SQL Server Agent service.
When I execute the job from SQL Server it uses the Agent service, and for that service you need to assign a service account that has access rights to the desired file path.
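For example (hypothetical folder and service account; substitute your own), you could grant the account that runs the Agent job read access with icacls:
rem (OI)(CI)R = read permission inherited by files and subfolders
icacls "D:\Source Files" /grant "CONTOSO\sqlagent_svc":(OI)(CI)R
If the files live on a network share, the account also needs permission on the share itself, not just on the NTFS folder.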

Informatica Cloud - Picking up files from SFTP and inserting records in Salesforce

Our objective is as follows
a) Pick up a file "Test.csv" from a Secure FTP location.
b) After picking up the file we need to insert the contents of the file into an object in Salesforce.
Step 1: I created a connection for the remote SFTP location (which will contain "Test.csv").
Step 2: Then I started to build a Data Synchronization Task.
What we want is for the Informatica Cloud to connect to the secure FTP location and extract the contents from a .csv from that location into our object in Salesforce.
But as you can see in Step 2, it does not allow me to choose .csv from that remote location.
Instead, the wizard prompts me to choose a file from a local directory (on my machine, where the Secure Agent is running), and this is not what I want.
What should I do in this scenario ?
Can someone help ?
You can write a UNIX script to transfer the file to your Secure Agent machine and then use Informatica to read the file. Although I have never tried using SFTP in Informatica Cloud, I have used the cloud product and I do know that all files are tied to the location of the Secure Agent (either a server or a local computer).
The local directory is used for template files. The idea is that you set up the task using a local template and then IC will connect to the FTP site when you actually run the task.
The Informatica video below shows how this works at around 1:10:
http://videos.informaticacloud.com/2FQjj/secure-ftp-and-salesforececom-using-informatica-cloud/
Can you clarify the Secure Agent OS: Windows or Linux?
For a Windows environment you will have to call the script using the WinSCP or Cygwin utility; I recommend the former.
For Linux, the basic commands in a script should work.
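A minimal WinSCP command-line sketch for the Windows case (host, credentials, paths and the host key fingerprint are all placeholders you would replace):
rem download Test.csv from the SFTP server to a folder on the Secure Agent machine
winscp.com /command ^
  "open sftp://ic_user:secret@sftp.example.com/ -hostkey=""ssh-rsa 2048 xx:xx:xx:xx...""" ^
  "get /inbound/Test.csv C:\SecureAgent\incoming\" ^
  "exit"
The downloaded copy then sits on the Secure Agent machine, where the Data Synchronization Task can read it as a local flat file.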

SSIS package executed in SQL Server Agent doesn't do its work (even while reporting success)

I have to say that I hate asking such a general question as "What am I doing wrong?", but I simply have no idea what the problem could be:
I've created an SSIS package that takes data from flat files (CSV), computes the average of one of the columns grouped by date, writes the result to the database and deletes the original file. All works fine when executed within SSIS, but when I schedule it with SQL Server Agent it simply doesn't work: the log reports success, but there is no new data in the database and the .csv file still exists in its original location.
I know about the problem with the protection level set in SSIS, so I've changed it to "EncryptAllWithPassword" and I use the same password in the SQL Server Agent job.
Here is a link to the Server Agent Job script (created as "script job as DROP and CREATE")
Edit: Just to make things weirder, running
dtexec /f {filepath} /de {password}
executes the package without a problem. I know I can schedule such a command in Windows itself, but I'd like to keep all scheduled jobs in one place: in SQL Server Agent.
EDIT: Solved by changing the path to UNC
There are two important things to remember when setting up packages to run via a SQL Server Agent job.
Use UNC paths for all file locations, no matter how simple. There is a high probability that the server will have a different view of the file structure than your development machine, so UNC paths ensure that both machines are referencing the same paths.
Use a proxy account to execute that package, as described here http://www.mssqltips.com/sqlservertip/2163/running-a-ssis-package-from-sql-server-agent-using-a-proxy-account/.
The proxy account must have access to the physical paths and the server objects.
This also allows for security stratification on your various packages (not all packages need access to everything).

SSIS - Execute Package Task - file or SQL Server

What is the best approach to execute a package within another package?
1. From SQL Server?
In this case, I have to deploy the child package every time the master package is executed.
2. From a file?
In this case, I am forced to deploy all packages as files (not to SQL Server). Then the local package path will not be the same as the package path on the server...
I prefer executing from a file.
This allows me to use source control as a way to deploy the files. Also, in SQL Server 2012 and higher you can actually do diffs on SSIS package files.
If you want to keep the path the same, you could try a mapped directory on your local machine; for example, create an E: drive that maps to a location on C: (a sketch with subst follows below). This allows you to keep local and server locations in sync.
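A minimal sketch of that mapping with the built-in subst command (the drive letter and folder are just examples; the mapping only lasts until logoff unless you recreate it, e.g. in a logon script):
rem map E: to a local folder so package paths match the server layout
subst E: "C:\SSIS\Packages"
rem remove the mapping again when it is no longer needed
subst E: /D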
