Characters are converted into special symbols - database

I have database records in an MS Excel file. I save it as a CSV file and then create a database in Firefox's SQLiteManager by importing that CSV file.
But characters like ..., ', " and - are converted to �.
I have also tried saving the CSV file in UTF-8 format, but then those characters are converted to Õ.
Does anyone have an idea how to solve this?
Thanks.

Perhaps you might want to consider escaping the quotes in your CSV file, e.g. try "" or "'. Also pay a bit more attention to the "Fields enclosed by" section in the SQLiteManager add-on, making sure those fields are enclosed properly.
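If re-saving from Excel keeps mangling the characters, another option (a minimal sketch, assuming Excel exported the CSV as Windows-1252 and using hypothetical file names) is to re-encode the file to UTF-8 yourself before importing it; Python's csv module will also double any embedded quotes for you:

import csv

# Hypothetical file names; Excel on Windows usually exports CSV as Windows-1252 (cp1252).
with open("records.csv", newline="", encoding="cp1252") as src, \
     open("records-utf8.csv", "w", newline="", encoding="utf-8") as dst:
    reader = csv.reader(src)
    # QUOTE_ALL encloses every field in quotes and doubles any embedded " characters.
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL)
    writer.writerows(reader)

Then point SQLiteManager's import at records-utf8.csv, with " as the "Fields enclosed by" character.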

Related

Unable to create directory in Oracle 12c

I am using Oracle 12.2. I wish to import Data Pump files. To do that, I want to create a directory object for the folder containing the files and then run the import. I use the following command to create the directory:
CREATE DIRECTORY dpump_dir1 AS ‘D:\dumpdir’;
I am getting the following error:
SQL Error: ORA-00911: invalid character
00911. 00000 - "invalid character"
*Cause: identifiers may not start with any ASCII character other than
letters and numbers. $#_ are also allowed after the first
character. Identifiers enclosed by doublequotes may contain
any character other than a doublequote. Alternative quotes
(q'#...#') cannot use spaces, tabs, or carriage returns as
delimiters. For all other contexts, consult the SQL Language
Reference Manual.
Could anybody tell me what is going wrong?
The quotes being used in the code you provided are not simple straight single quotes; it's slightly easier to see when formatted as code:
CREATE DIRECTORY dpump_dir1 AS ‘D:\dumpdir’;
You can also use your text editor, or dump the string, to see which characters it contains:
select dump(q'[CREATE DIRECTORY dpump_dir1 AS ‘D:\dumpdir’;]', 1016) from dual;
DUMP(Q'[CREATEDIRECTORYDPUMP_DIR1AS‘D:\DUMPDIR’;]',1016)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Typ=96 Len=49 CharacterSet=AL32UTF8: 43,52,45,41,54,45,20,44,49,52,45,43,54,4f,52,59,20,64,70,75,6d,70,5f,64,69,72,31,20,41,53,20,20,e2,80,98,44,3a,5c,64,75,6d,70,64,69,72,e2,80,99,3b
You can see that it's reported as 49 bytes despite being 45 characters long, indicating you have multibyte characters. Before the final semicolon, which is shown as 3b, you have the sequence e2,80,99, which represents the ’ right single quotation mark, and a bit earlier you have the sequence e2,80,98, which represents the ‘ left single quotation mark.
If you use plain quotes it should work:
CREATE DIRECTORY dpump_dir1 AS 'D:\dumpdir';
Presumably you copied and pasted the text from an editor which helpfully substituted curly quotes.
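If you want to check a script before running it, a quick way (a minimal sketch outside the database, with a hypothetical file name) is to scan it for non-ASCII characters such as curly quotes:

import pathlib

# Hypothetical file name; prints the position and code point of every non-ASCII character.
text = pathlib.Path("create_dir.sql").read_text(encoding="utf-8")
for lineno, line in enumerate(text.splitlines(), start=1):
    for col, ch in enumerate(line, start=1):
        if ord(ch) > 127:
            print(f"line {lineno}, col {col}: {ch!r} (U+{ord(ch):04X})")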

How can I transform data from (.ddm .pnt .fdt .bin) files to .csv

I have data stored in .ddm, .pnt, .fdt and .bin files.
How can I export (or extract or transform) data from those file formats into .csv?
I think it's an ADABAS database.
Yes, the file extensions look like an Adabas database.
You need an Adabas/Natural environment to run the database; you can then write a simple program in Natural that reads the database content and writes it to a text "work file" with ";" delimiters and a .csv extension. I haven't come across any tool for manually unpacking the database files.
As peterozgood pointed out, you would normally use Natural for that.
If you're using Natural on Windows or Unix you can code the following
DEFINE WORK FILE nn TYPE 'CSV'
...where nn is a number between 1 & 32, identifying the desired workfile.
(this may also be specified by your Admin in the so-called Natparm, along with Codepage & Delimiter)
Then you can output data to the file by coding
WRITE WORK FILE nn operand1 ... operandN
Natural will automatically create the csv.
Fields will be separated by the delimiter and quoted and escaped as necessary.
(the delimiter may be specified in the Natparm or as a startup parameter)
Unfortunately this functionality is not available with Mainframe Natural.
(CSV that is. Workfiles are of course available)

SSIS - Flat File with escape characters

I have a large flat file I'm using to recover data. It was exported from a system using double quotes " as the qualifier and a pipe | as the delimiter. SSIS can be configured for this without a problem, but where I'm running into issues is with the \ escape character.
The row causing the issue:
"125004267"|"125000316"|"125000491"|"height"|"5' 11\""|"12037"|"46403"|""|"t"|""|"2012-10-01 22:34:01"|"2012-10-01 22:34:01"|"1900-01-01 00:00:00"
The fourth column in the database should be 5' 11".
I'm getting the following error:
Error: 0xC0202055 at Data Flow Task 1, Flat File Source [2]: The column delimiter for column "posting_value" was not found.
How can I tell SSIS to handle the \ escape character?
I know this is quite old, but I just ran into a similar issue regarding escaping quotes in CSVs in SSIS. It seems odd there isn't more flexible support for this, but it does support VB-style double-double quotes. So in your example you could pre-parse the file to translate it into
"125004267"|"125000316"|"125000491"|"height"|"5' 11"""|"12037"|"46403"|""|"t"|""|"2012-10-01 22:34:01"|"2012-10-01 22:34:01"|"1900-01-01 00:00:00"
to get your desired output. This at least works on SQL Server 2014.
This also works for Excel (tested with 2010). Though, oddly, only when inserting data from a text file, not when opening a CSV with Excel.
This does appear to be the standardized method according to RFC 4180
which states
Fields containing line breaks (CRLF), double quotes, and commas
should be enclosed in double-quotes
...
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote.
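As a rough illustration of that pre-parse step (a minimal sketch with hypothetical file names, assuming the backslash is only ever used to escape a double quote), you could rewrite \" as "" before handing the file to SSIS:

# Hypothetical file names; turns backslash-escaped quotes (\") into doubled quotes ("").
with open("export.txt", encoding="utf-8") as src, \
     open("export-fixed.txt", "w", encoding="utf-8") as dst:
    for line in src:
        dst.write(line.replace('\\"', '""'))

The sample row above would then contain "5' 11""", which the flat file source can parse with the standard text qualifier.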
This probably isn't the answer you are looking for, but...
I'd reach out to the technical contacts at the source of the data and explain to them that if they're going to send you a file that uses double quotes as the text qualifier, that implies there are never any double quotes in the text. Since double quotes can appear in the data, as they do here, ask them to use another text qualifier, or none at all.
Since there are pipe delimiters in use, what's the point of having text qualifiers?
Seems redundant.

Which is the best character to use as a delimiter for ETL?

I recently unloaded a customer table from an Informix DB and several rows were rejected because the customer name column contained non-escaped vertical bar (pipe) characters, the default DBDELIMITER in the source db. I found out that the field in their customer form has an input mask that allows any character to be entered, including letters, numbers or symbols. So I persuaded the user to run a blanket update on that column to change the pipe symbol to a semicolon. I also discovered other rows containing asterisks and commas in different columns. I can imagine what would happen if this table were unloaded in CSV format, or what damage the asterisks could do!
What is the best character to define as a delimiter?
If tables are already tainted with pipes, commas, asterisks, tabs, backslashes, etc., what's the best way to clean them up?
I have to deal with large volumes of narrative data at my job. This is always a nightmare because users are apt to put ANY character in there, including unprintable characters. You can run a cleanup operation, but you have to do it every time you load data, and it likely won't work forever. Eventually someone will put in whatever character you choose as a separator, which is not a problem if your CSV handling libraries can handle escaping properly, but many can't. If this is a one-time load/unload, you're probably fine, but if you have to do it more often....
In the past I've changed the separator to the back-tick '`', the tilde '~', or the caret '^'. All failed in the current effort. The best solution I could come up with is to not use CSV format at all. I switched to XML. Even so there were still XML illegal characters, but these can be translated out with atlassian-xml-cleaner-0.1.jar.
Unload the customer table with the default pipe delimiter replaced by a character that doesn't exist in the data; string-search first to find one, e.g. "~":
unload to file delimiter "~"
select * from customer;
Clean your file (or not)
(vi replace string) :g/theoldstring/s//thenewstring/g
or
(unix prompt) sed 's/old-char/new-char/g' fileold > filenew
(Once clean, I'd personally change "~" in the unload file back to "|" or "," as the CSV standard.)
Load to source db.
If you can, use a multi-character delimiter. It can still fail, but it's much less likely to.
Or, escape the delimiter while writing the export file (Informix docs say "LOAD TABLE" escapes by prefixing delimiter characters with backslash). Proper CSV has quoting and escaping so it shouldn't matter if a comma is in the data, unless your exporter and loader cannot handle proper CSV.
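To illustrate that last point (a minimal sketch with made-up data), a CSV library that quotes and escapes properly round-trips fields even when they contain the delimiter, quotes, or commas:

import csv, io

# Made-up row containing the pipe delimiter, double quotes, and a comma.
rows = [["ACME|Ltd.", 'says "hi", really', "plain"]]
buf = io.StringIO()
csv.writer(buf, delimiter="|", quoting=csv.QUOTE_MINIMAL).writerows(rows)
print(buf.getvalue())                                 # troublesome fields come out quoted
buf.seek(0)
print(list(csv.reader(buf, delimiter="|")) == rows)   # True: the data round-trips cleanly

If the exporter and loader both behave like this, the exact delimiter character matters far less.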

How to bake UTF-8 files with CakePHP's console?

For some reason, every file that I bake with CakePHP's console is regarded as ISO-8859-1 encoded by my IDE, Dreamweaver. This works fine up to the point where I end up typing a special character, which is then displayed incorrectly by the browser, since the encoding it was saved with (by the editor) differs from the page's overall encoding.
How can I force the console to produce UTF-8 files, with a BOM if necessary?
I've already tried converting the template files that are used to bake the standard scaffolding pages, but with no luck.
I have the same problem - baked files are NOT UTF-8 but ASCII. (I use the Notepad++ editor, which makes it easy to convert and save files in another format.)
Once bake generates the files, I have to convert them to UTF-8 one by one to be able to work with Polish characters.
I tried changing the template files to UTF-8, but somehow this does not help. This may have something to do with the fact that the default files do not contain any non-ASCII characters, so even when saved as UTF-8 they stay ASCII.
The simplest way I found to overcome this is to modify a template file, e.g.
cake\console\templates\default\classes\model.ctp
to include a UTF-8 character somewhere, e.g.:
//'message' => 'Your custom message here ł',
(notice the non-ASCII character at the end of the line).
Then converting and saving as UTF-8 makes sure the template file is UTF-8.
Now the model files are generated as UTF-8.
The baked files are UTF-8, or rather, they only contain basic ASCII characters which are identical to the basic UTF-8 range, so can be regarded as either. It's Dreamweaver's problem, not a problem with bake. Check the Dreamweaver settings (or code in a decent editor ;-P).
You do not want to include a BOM, it'll screw you over later.
Use the Bake_UTF8 plugin =]
http://www.github.com/pedroelsner/bake_utf8
I hope this is helpful.
Pedro Elsner
Another way to achieve this is to open the PHP files that are producing UTF-8 content (without BOM) and then save them as UTF-8 with BOM using Notepad++ (Encoding -> Encode in UTF-8).
In my case I had Excel CSV file:
/patients/exportFirstReport/atskaite1-25-10-2013.csv
Then I had to convert encoding of PHP files down the stack:
\index.php
\app\Controller\PatientsController.php
\app\View\Patients\csv\export_first_report.ctp
\app\View\Layouts\csv\default.ctp
After converting the encoding of these files, it produces readable UTF-8 Excel files.
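If you would rather script the conversion than re-save each file in Notepad++, a minimal sketch (using the files listed above and assuming the originals are plain ASCII or ISO-8859-1) could be:

import pathlib

# Files from the question; assumed to be ASCII/ISO-8859-1 before conversion.
files = ["index.php",
         "app/Controller/PatientsController.php",
         "app/View/Patients/csv/export_first_report.ctp",
         "app/View/Layouts/csv/default.ctp"]
for name in files:
    p = pathlib.Path(name)
    text = p.read_text(encoding="iso-8859-1")
    p.write_text(text, encoding="utf-8-sig")  # "utf-8-sig" writes a UTF-8 BOM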
