Data Integration with Pentaho Kettle - database

I have three input components:
Excel Input
XML Input
Table Input
I would like to do some validation and verification before inserting the integrated data into a database.
Which component should I use for the data transformation/verification?

Try using the Data Validator step in PDI. It lets you validate the rows coming in from the input steps, using either the built-in checks or regular expressions.
Hope it helps :)

There are several approaches to the verification. The easiest is to code your validations in the "Modified Java Script Value" step, but it is the most resource-hungry option, so Pentaho recommends avoiding it if you can. Instead, you can use other steps, such as the ones under the "Flow" category.
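As a rough sketch of the kind of rule such a script might hold, here is a row check written as plain JavaScript so it can run outside PDI. The field names customer_id and email and the helper validateRow are hypothetical, and the row is passed in as an ordinary object purely for illustration; this is not PDI's scripting API.

```javascript
// Hypothetical validation helper; field names are made up for illustration.
function validateRow(row) {
  var errors = [];

  // customer_id is assumed to be a whole number.
  if (typeof row.customer_id !== 'number' || row.customer_id % 1 !== 0) {
    errors.push('customer_id must be an integer');
  }

  // Very loose e-mail shape check: something@something.something
  var email = String(row.email || '');
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(email)) {
    errors.push('email is malformed');
  }

  return { valid: errors.length === 0, errors: errors };
}

// Made-up sample rows.
console.log(validateRow({ customer_id: 7, email: 'a@b.co' }));
console.log(validateRow({ customer_id: 'x', email: 'nope' }));
```

Rows flagged as invalid could then be routed to an error output rather than the database insert.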

Related

How to extract the data from IBMKeyword in UiPath?

I am experimenting with the IBM Watson NLU’s Text Analysis package in UiPath with a simple text. I am able to extract the KeyValue pair information for Categories, Concept, and Sentiments using .ToString(). However, I am having trouble figuring out how to extract information for Keywords and Entity, which are of type IBMKeyword and IBMEntity.
A simple .ToString() call in the message box gives output that isn't helpful, or I don't know how to use it.
Below is the screenshot of my UiPath Studio:
Try this. Since the variable holds multiple keywords, it cannot be printed in a single message box without a loop.

How to make drill down tables in Zeppelin?

I am trying to make each value in one column of a table clickable so that I can build drill-down functionality with a Zeppelin table. But the following sample code does not work at all.
print(s"""%table
a\tb\n%html <button>x</button>1\t2\n%html <button>y</button>3\t4
""")
It will take quite some effort to make this work.
The basic idea is to convert a data source (e.g. a Spark DataFrame) into a complete, self-contained HTML section that Zeppelin interprets. Hiding and showing rows needs to be handled by a JavaScript library.
Zeppelin uses Bootstrap, so we can use the Bootstrap library directly. This SO question might help: Bootstrap collapse. It will probably need more styling.
If you just want drill-down functionality and it does not strictly have to be a table, and you are using Spark, it might be a bit easier to implement with spark-highcharts, as in the Highcharts column drill-down example.
Finally my code worked. The issue seems to be that an HTML tag in the first column does not work, although it works in all the other columns. Adding one more column at the front made it work.
print(s"""%table
dummy\ta\tb\np1\t%html <button>x</button>1\t2\np1\t%html <button>y</button>3\t4
""")

AngularJS ui-grid import XLSX data best approach

What would be the best approach to import XLSX data to be displayed using an AngularJS ui-grid?
Is the js-xlsx parser a good choice for this, or are there other open source XLSX parser tools better suited for this task? In my case the XLSX data is very basic, nothing complicated, but I would like to preserve the style info as much as possible. I anticipate the data grid will be less than 20 cols x 1000 rows.
Or would it be better to use an alternative data grid, such as Handsontable, instead of ui-grid? Would that be better suited for spreadsheet data?
Importing data into the grid with js-xlsx should work fine. I've been able to get it working with my simple Open Office files so I would imagine you will be mostly OK.
Style info is another question, though. If you're wanting to maintain cell-specific backgrounds and such that could be more difficult. Can you share your specific use case that you want to handle?
For others who might be interested: once you've read a file into your browser and turned it into a workbook you can use XLSX.utils.sheet_to_json() to easily dump the spreadsheet contents into a structure you can pass into your grid. If you pass { header: 1 } as an argument to that function it will return a simple array-of-arrays of the data. The first element in the array will be your header row if you have one. You can use that to create your column definitions.
If you want to see a working plunker check this one out: http://plnkr.co/edit/rYC3nd7undqJz2mr8Old?p=preview
And if you want a more in-depth tutorial I have this post explaining SheetJS and the contents of the plunker: http://brianhann.com/easily-import-spreadsheets-into-ui-grid/
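The array-of-arrays step described above can be sketched roughly as follows. The helper name toGridOptions and the generated field names (col0, col1, ...) are assumptions made for illustration, and the sample rows stand in for what XLSX.utils.sheet_to_json(sheet, { header: 1 }) would return, so the snippet runs without the library.

```javascript
// Hypothetical helper: turns an array-of-arrays (header row first) into
// the columnDefs and data arrays that ui-grid expects.
function toGridOptions(rows) {
  if (!Array.isArray(rows) || rows.length === 0) {
    return { columnDefs: [], data: [] };
  }

  const header = rows[0];
  const body = rows.slice(1);

  // Use generated field names so header text with spaces or dots
  // cannot break field lookups; show the original text as the title.
  const columnDefs = Array.from(header, (title, i) => ({
    field: 'col' + i,
    displayName: title == null ? 'Column ' + (i + 1) : String(title),
  }));

  // Short rows are padded with null so every record has every field.
  const data = body.map((row) => {
    const record = {};
    columnDefs.forEach((col, i) => {
      record[col.field] = row[i] === undefined ? null : row[i];
    });
    return record;
  });

  return { columnDefs, data };
}

// Made-up sample standing in for a parsed worksheet.
const sample = [['Name', 'Age'], ['Ann', 31], ['Bob']];
console.log(JSON.stringify(toGridOptions(sample)));
```

The resulting columnDefs and data can then be assigned to the grid options object bound to the ui-grid directive.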

Drupal: How is this component named?

I would like to create a table that looks/behaves like the one to manage fields when editing content types.
How is this one named? Is this form API?
If you're looking for the drag and drop sorting behavior, then you should look at the documentation for drupal_add_tabledrag.
And perhaps this tutorial might help: http://aswapathy.com/d78tu/tabledrag/theme_the_form_doc
I would implement it using Forms API together with a table.
If you are new to forms API, this step by step introduction is really good:
http://drupal.org/node/262422

Solr/Lucene Query Validation

Does anyone have a regex that can be used to validate that a query to be sent to Lucene is well formatted?
https://github.com/praized/lucene-query-validator/blob/master/src/luceneQueryValidator.js
This is a JavaScript attempt. I have not verified its success, but from reviewing the code, everything looks legit.
If you're allowing your users to enter free text, there is always the chance that they'll mistype a field name (e.g. naem:Bob instead of name:Bob). This validator will not catch issues like that.
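One way to catch mistyped field names is a separate check against the fields you know exist in your index. The sketch below is a simplified illustration, not part of the validator linked above: findUnknownFields is a hypothetical helper, and its regex only recognises plain field:value prefixes, ignoring escaped colons and other corners of the query syntax.

```javascript
// Hypothetical helper: lists field names used in a query that are not
// in the caller-supplied list of known fields.
function findUnknownFields(query, allowedFields) {
  const allowed = new Set(allowedFields);
  const unknown = [];

  // Blank out quoted phrases so a colon inside a phrase is not
  // mistaken for a field separator.
  const unquoted = query.replace(/"(?:[^"\\]|\\.)*"/g, '""');

  // A field name is an identifier at the start of the query or after
  // whitespace, "(", "+", "-" or "!", immediately followed by ":".
  const fieldPattern = /(?:^|[\s(+\-!])([A-Za-z_][\w.]*):/g;
  let match;
  while ((match = fieldPattern.exec(unquoted)) !== null) {
    const field = match[1];
    if (!allowed.has(field) && !unknown.includes(field)) {
      unknown.push(field);
    }
  }
  return unknown;
}

// Made-up query and field list.
console.log(findUnknownFields('naem:Bob AND age:[20 TO 30]', ['name', 'age']));
```

Running this alongside a syntax validator would let you reject or flag the query before it is sent.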
I've created a js AMD module here: https://github.com/grahamscott/lucene-validator-amd-module
It's based on the praized module above, but is easier to integrate client-side, and doesn't rely on window.alert()
