Manipulating database rows with UDJC in Pentaho Kettle - database

I want to get rows from the database and then apply some Java code to each row.
How do I get a single row to manipulate in the User Defined Java Class step?
It would be very helpful if someone could illustrate this scenario with an example.
Thanks in advance.

There are samples in the $KETTLE_HOME/samples directory, including examples that use the User Defined Java Class step.
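As a rough illustration of the row-at-a-time pattern (not taken from the bundled samples): in the step's default code template, Kettle calls processRow() repeatedly, and each call fetches one row with getRow(), which returns an Object[] or null when there is no more input, and then passes the result on with putRow(). The plain-Java sketch below only mimics that loop so it can run outside Kettle. The class RowLoopSketch, its stand-in getRow()/putRow() methods, the field layout and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java stand-in for the loop Kettle runs around a UDJC step (hypothetical class).
class RowLoopSketch {
    private final Iterator<Object[]> input;                  // stand-in for the incoming row stream
    private final List<Object[]> output = new ArrayList<>(); // stand-in for the outgoing row stream

    RowLoopSketch(List<Object[]> rows) {
        this.input = rows.iterator();
    }

    // Stand-in for the step's getRow(): next row, or null when input is exhausted.
    Object[] getRow() {
        return input.hasNext() ? input.next() : null;
    }

    // Stand-in for the step's putRow(...): hands a row to the next step.
    void putRow(Object[] row) {
        output.add(row);
    }

    // Same shape as processRow(): handle exactly one row per call, return false when done.
    boolean processRow() {
        Object[] r = getRow();
        if (r == null) {
            return false; // no more rows
        }
        // Make room for one extra output field and fill it from column 1 (assumed to be a name).
        Object[] out = Arrays.copyOf(r, r.length + 1);
        String name = (String) r[1];
        out[r.length] = (name == null) ? null : name.trim().toUpperCase();
        putRow(out);
        return true;
    }

    // Drives processRow() until it reports that the input is finished.
    static List<Object[]> run(List<Object[]> rows) {
        RowLoopSketch step = new RowLoopSketch(rows);
        while (step.processRow()) {
            // each iteration handles a single row
        }
        return step.output;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {1L, " alice "});
        rows.add(new Object[] {2L, "bob"});
        for (Object[] row : run(rows)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Inside the real step you would keep only the body of processRow() and rely on the methods Kettle provides. The samples directory is the place to check the exact signatures.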

Related

Is there a way to export an Informatica mapplet's 'graphical' data to a simple CSV/Excel file?

The firm I work for has a lot of data sources feeding the firm's database through the Informatica ETL tool, stored in mapplets and other data models (sorry if I'm not using the exact terminology).
The problem is that all the business logic is stored in the 'graphical interface' and nowhere else. Every time I want to see which field goes into a target field, I have to trace the inputs through the mapplet, and that takes a very long time.
The question is: is there a tool that can take all the relationships in an Informatica mapplet and export them to an Excel table, so I can see everything without tracing? That way I could try to write proper documentation.
Thanks in advance.
It's possible to export mappings or whole workflows to XML. Next, you can use this tool: it will create tables with the source-to-target dependencies for every mapping.
Keep in mind that it only maps input to output. It won't extract the full logic and the transformations done along the way; that would be too complex for a simple visualization.
Informatica supports exporting mapping information to Excel; the documentation explains how to do it.
However, for anything other than the simplest of mappings, what ends up in Excel is not that easy to understand. If your Informatica installation supports it, then using the lineage capabilities is a much better bet.

Creating an SQL view and schema using given fields and alias

Hi, could you please help me with the above question? I don't understand how to tackle it. I am required to create a SQL view as well as a schema based on the given information (in the above picture). The main difficulty is the filter, which I have never done before.

Binding/Auto-Updating charts in Word document with data from MS SQL Server

I have a task that involves a Microsoft Word document and a database. The Word document has numerous charts that users create in two steps: first they prepare the charts in Excel, then they take a screenshot and paste it into the Word document. It's a tedious process, because the charts have to be redone any time someone wants to run what-if simulations.
When I insert a chart in Word, the underlying data comes from an Excel sheet, which I can populate from the database as a one-time operation. That is not very productive, though, because users still have to open the Excel sheet and refresh the data manually.
I tried to find different solutions, but I'm drawing a blank since this is totally new work for me. There are elementary examples suggesting VSTO, but I couldn't find more detailed examples specifically for charts in a scenario like mine.
Has anyone tackled a similar issue? If so, please advise. I'm open to using VSTO, OpenXML or even R packages that can help auto-generate a Word document with updated charts.
Thank you.
I found a solution using R; it does what I was looking for. A related task and its steps are in my other question, here.

How can I create a crosstab/pivot table in pentaho data integration?

I would like to turn the first picture into the second picture (sorry, I'm too tired to embed the pictures): https://imgur.com/a/SJtNo (output: Excel)
I also don't want to use JavaScript.
Thanks for your help.
I think you might want to use the Row Normaliser step. There are two different samples in the samples directory of your PDI installation (\data-integration\samples\transformations).
There are also some examples on the Pentaho wiki, which you can reach through the help button within the step itself. In any case, here is the link: http://wiki.pentaho.com/display/EAI/Row+Normaliser
There are also quite a few examples of this step, and answers to questions like yours, here on Stack Overflow and on several PDI developer blogs.

Need help stringing together database processes

I need some help from those with more knowledge than I possess. I am currently trying to figure out how to get real-time data from a database.
I need to be able to find the company info from the most recent licensees. So the search parameter I'm using is 2016-05-10T00:00:00.000
The full string together from the API and the search parameter can be found directly at this link:
https://www.hurl.it/?method=GET&url=https%3A%2F%2Fdata.wa.gov%2Fresource%2Fv8vv-gqqs.json&headers=%7B%22X-App-Token%22%3A[%22bjp8KrRvAPtuf809u1UXnI0Z8%22]%7D&args=%7B%22licenseeffectivedate%22%3A[%222004-07-14T00%3A00%3A00.000%22]%7D
So I'm looking to retrieve the most recently added accounts in order to verify that (1) the license is active and (2) the license number the contractor gives matches what the website says. I would like to figure out how to automate this so that I know when the newest licenses are added and they are extracted/downloaded into Excel.
If anyone can help with this, I would appreciate it very much. I also have more questions about using databases, if any of you are experts in the field.
Once again, thank you!
Clay
Since your goal is to get this data into Excel, have you considered using something like our OData support instead? You could structure your query in Excel PowerBI and it'd automatically refresh the data.
Another option would be to use our CSV output type with an Excel web query. I use the IMPORTDATA(...) function in Google Sheets, which is very similar.
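For the CSV route, here is a minimal sketch of how the query URL could be put together. It assumes the dataset from the question's link also answers at a .csv endpoint and that licenseeffectivedate can be passed as a plain query parameter, as in that link. The class and method names are made up for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds a CSV query URL for the dataset in the question.
class LicenseQueryUrl {
    // Same dataset as the question's link, with .csv instead of .json (assumed to be available).
    static final String CSV_ENDPOINT = "https://data.wa.gov/resource/v8vv-gqqs.csv";

    // Filters on licenseeffectivedate, the field used in the question's link.
    static String forEffectiveDate(String isoTimestamp) {
        // Percent-encode the value so the colons in the timestamp are safe in a query string.
        String encoded = URLEncoder.encode(isoTimestamp, StandardCharsets.UTF_8);
        return CSV_ENDPOINT + "?licenseeffectivedate=" + encoded;
    }

    public static void main(String[] args) {
        System.out.println(forEffectiveDate("2016-05-10T00:00:00.000"));
    }
}
```

The printed URL is what you would hand to an Excel web query or to IMPORTDATA(...) in Google Sheets.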
