I need help extracting multivalue fields, which are stored in XML format, from DB2 into a SQL Server relational database. I tried using Talend Studio, but the multivalue fields can't be extracted because they have the data type Object, which cannot be converted to a string. How best can I achieve this?
There are fields like BALANCE_16 and other numbered fields which hold multiple values that are not reflected in other fields. How do I solve this?
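One generic approach, independent of Talend, is to shred each multivalue XML field into one row per value before loading. A minimal Python sketch; the element names (`BALANCE_16`, `value`) are assumptions here, so adapt them to the real XML structure:

```python
# Hypothetical sketch: shred a multivalue XML field (e.g. BALANCE_16)
# into (record_id, position, value) rows before loading into SQL Server.
# The element names are assumptions -- adapt them to the actual XML.
import xml.etree.ElementTree as ET

def shred_multivalue(record_id, xml_text):
    """Return one (record_id, position, value) tuple per <value> element."""
    root = ET.fromstring(xml_text)
    return [(record_id, i, v.text)
            for i, v in enumerate(root.findall("value"), start=1)]

rows = shred_multivalue(
    42, "<BALANCE_16><value>10.50</value><value>7.25</value></BALANCE_16>")
# rows -> [(42, 1, '10.50'), (42, 2, '7.25')]
```

Each tuple can then be inserted into a child table keyed on the record id, which keeps the target schema relational.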
I am trying to insert records from Oracle into PostgreSQL. To do it, I use:
QueryDatabaseTableRecord -> PutDatabaseRecord
QueryDatabaseTableRecord -> fetches from Oracle (Record Writer: CSV)
PutDatabaseRecord -> inserts the records into PostgreSQL (Record Reader: CSV)
A few weeks ago, I faced the same issue with PostgreSQL (see my earlier Cloudera question).
This time I set the schema to public and Translate Field Names to false.
I also changed the PostgreSQL table columns to upper-case letters, matching what I had used in Oracle.
I found the solution for this. It's not directly related to Apache NiFi; it's really a PostgreSQL thing.
Data taken from Oracle comes with upper-case headers. The headers MUST be converted to lower-case, because PostgreSQL folds unquoted identifiers to lower-case. Creating the PostgreSQL columns with upper-case names won't solve this issue.
To do this, I used the ReplaceText processor:
Search Value : MY_COLUMN1,MY_COLUMN2
Replacement Value : my_column1,my_column2
I hope this will help someone who is trying to get data from Oracle and load it into PostgreSQL.
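The ReplaceText configuration above is a literal substitution on the flowfile content. As a sketch of the same transformation outside NiFi (a hypothetical helper, not part of NiFi itself), here is a Python function that lower-cases only the header line of a CSV, leaving the data rows untouched:

```python
# Lower-case only the first (header) line of a CSV, the way the
# ReplaceText substitution above does -- data rows are left alone.
def lowercase_header(csv_text):
    header, _, rest = csv_text.partition("\n")
    return header.lower() + ("\n" + rest if rest else "")

flowfile = "MY_COLUMN1,MY_COLUMN2\n1,foo\n2,BAR"
print(lowercase_header(flowfile))
# my_column1,my_column2
# 1,foo
# 2,BAR
```

Note that the data value `BAR` stays upper-case; only the identifiers that PostgreSQL will fold are renamed.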
I have JSON in a field, but I need to check its schema before I process it. I need to know if anything has been added to or removed from the schema.
Is there a way to extract the JSON schema from a JSON string so I can compare it to a known schema?
An online example is http://jsonschema.net/, but I want to do the same thing in T-SQL.
SQL Server doesn't support any JSON Schema binding.
If your JSON is simple or flat, you can use
SELECT [key] FROM OPENJSON(@json)
to find all keys on the first level and compare them with some expected key set.
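Outside T-SQL, the same first-level comparison can be sketched in Python; the expected key set below is purely illustrative:

```python
# Diff the top-level keys of a JSON document against a known/expected
# key set -- the same first-level check OPENJSON enables in T-SQL.
import json

def diff_keys(json_text, expected_keys):
    actual = set(json.loads(json_text).keys())
    expected = set(expected_keys)
    return {"added": actual - expected, "removed": expected - actual}

doc = '{"id": 1, "name": "x", "extra": true}'
print(diff_keys(doc, ["id", "name", "price"]))
# {'added': {'extra'}, 'removed': {'price'}}
```

An empty set in both slots means the first level of the document still matches the known schema; nested structures would need the check applied recursively.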
My database runs on SQL Server 2012. One column of my table contains RTF text; the column's data type is nvarchar(MAX).
I want to set up a full-text search for this column which parses the RTF and searches only the real text, so that I don't get RTF tags in my results.
As I understand it, parsing RTF should already be part of SQL Server, but I can't get it working :-(
I did the following:
Create a full text catalog
Select the column containing rtf and add a full_text Index
But I still get wrong results
SELECT * FROM myTable WHERE
CONTAINS(myRtfColumn,'rtf')
--> I still get all rows back, because 'rtf' appears as a control word in every RTF document
Any ideas what I'm doing wrong? Do I have to activate RTF search for my SQL Server or something similar?
A full-text search works only on text. You are inserting binary content into your database: RTF. By choosing nvarchar you told SQL Server you want to store plain text, but you are storing markup. For such binary content, use varbinary(max) instead.
The problem will still remain, though, because the index routines don't know how to interpret rich text: which characters are control words and which are content.
So let's talk about the interpreter/filter.
The documentation says:
https://technet.microsoft.com/en-us/en-en/library/ms142531(v=SQL.105).aspx
varbinary(max) or varbinary data
A single varbinary(max) or varbinary column can store many types of documents. SQL Server 2008 supports any document type for which a filter is installed and available in the operating system. The document type of each document is identified by the file extension of the document. For example, for a .doc file extension, full-text search uses the filter that supports Microsoft Word documents. For a list of available document types, query the sys.fulltext_document_types catalog view.
Note that the Full-Text Engine can leverage existing filters that are installed in the operating system. Before you can use operating-system filters, word breakers, and stemmers, you must load them in the server instance, as follows:
Finally, to do:
Check whether an ".rtf" filter is available:
EXEC sp_help_fulltext_system_components 'filter';
Then add a computed column [Typ] to your table which always returns ".rtf":
alter table yourname add [Typ] AS (CONVERT([nvarchar](8),'.rtf',0));
This column can now be used as the type-column specification when creating the full-text index.
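If no .rtf filter turns out to be available, a workaround (my assumption, not a SQL Server feature) is to strip the RTF control words yourself and store the plain text in a separate nvarchar column that gets indexed instead. A crude Python sketch, deliberately not a full RTF parser:

```python
import re

# Crude RTF-to-text stripper: removes control words like \rtf1 or \b0
# and the group braces, leaving only the visible text. Real RTF has
# escapes and binary groups this sketch does not handle.
def strip_rtf(rtf):
    text = re.sub(r"\\[a-zA-Z]+-?\d* ?", "", rtf)  # control words
    text = re.sub(r"[{}]", "", text)               # group braces
    return text.strip()

sample = r"{\rtf1\ansi Hello \b World\b0 !}"
print(strip_rtf(sample))
# Hello World!
```

The extracted text can then be searched with CONTAINS without RTF keywords polluting the results.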
I have stored a number of binary files in a SQL Server table. I created a full-text index on that table which also indexes the binary field containing the documents. I installed the appropriate IFilters so that SQL Server can also read .doc, .docx and .pdf files.
Using the DATALENGTH function I can retrieve the length/size of the complete document, but this also includes layout and other useless information. I want to know the length of just the text of the documents.
Using the iFilters SQL Server is able to retrieve only the text of such "complicated" documents but can it also be used to determine the length of just the text?
As far as I know (which isn't much), there is no way to query document properties via FTS. I would get the word count before inserting the document into the database, then insert the count along with it, into another column in the table. For Word documents, you can use the Document.Words.Count property; I don't know what the equivalent mechanism is for PDF documents.
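Whatever tool extracts the text before insert, the count itself is simple. A minimal Python sketch, assuming the plain text has already been pulled out of the document:

```python
import re

# Count the words in already-extracted plain text, so the count can be
# stored in its own column alongside the binary document at insert time.
def word_count(text):
    return len(re.findall(r"\b\w+\b", text))

print(word_count("The quick brown fox."))
# 4
```

Storing the count at insert time also avoids re-reading the binary blob every time the length is needed.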
At the moment, if I save <element></element> to a SQL Server 2008 database in a field of type xml, it converts it to <element/>.
How can I preserve the empty element exactly as written when saving?
In case this is a gotcha, I am using LINQ to SQL as my ORM to communicate with the database when saving.
What you're asking for is not possible.
SQL Server stores data in xml columns as a binary representation, so any extraneous formatting is discarded, as you found out.
To preserve the formatting, you would have to store the content in a text field of type varchar(MAX) or nvarchar(MAX). Hopefully you don't have to run XML-based queries on the data.
http://msdn.microsoft.com/en-us/library/ms189887.aspx
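This normalization isn't specific to SQL Server: most XML processors treat <element></element> and <element/> as the same infoset, so the distinction is lost as soon as the document is parsed. A quick round-trip through Python's ElementTree shows the same behavior:

```python
# Parse an empty element written as an open/close pair and serialize it
# again -- the parser keeps only the infoset, so the empty element comes
# back in the collapsed self-closing form, just as the xml type does.
import xml.etree.ElementTree as ET

root = ET.fromstring("<element></element>")
print(ET.tostring(root).decode())
# <element />
```

This is why only a plain (n)varchar(MAX) column preserves the original byte-for-byte formatting.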