I am using the mysql2sqlite.sh script from GitHub to convert my MySQL database to SQLite. The problem I am getting is that in my table the data 'E-001' gets changed to 'E?001'.
I have no idea how to modify the script to get the required result. Please help me.
The script is:
#!/bin/sh
# Converts a mysqldump file into a Sqlite 3 compatible file. It also extracts the MySQL `KEY xxxxx` from the
# CREATE block and creates them in separate commands _after_ all the INSERTs.
# Awk is chosen because it's fast and portable. You can use gawk, original awk or even the lightning fast mawk.
# The mysqldump file is traversed only once.
# Usage: $ ./mysql2sqlite mysqldump-opts db-name | sqlite3 database.sqlite
# Example: $ ./mysql2sqlite --no-data -u root -pMySecretPassWord myDbase | sqlite3 database.sqlite
# Thanks to @artemyk and @gkuenning for their nice tweaks.
mysqldump --compatible=ansi --skip-extended-insert --compact "$@" | \
awk '
BEGIN {
FS=",$"
print "PRAGMA synchronous = OFF;"
print "PRAGMA journal_mode = MEMORY;"
print "BEGIN TRANSACTION;"
}
# CREATE TRIGGER statements have funny commenting. Remember we are in trigger.
/^\/\*.*CREATE.*TRIGGER/ {
gsub( /^.*TRIGGER/, "CREATE TRIGGER" )
print
inTrigger = 1
next
}
# The end of CREATE TRIGGER has a stray comment terminator
/END \*\/;;/ { gsub( /\*\//, "" ); print; inTrigger = 0; next }
# The rest of triggers just get passed through
inTrigger != 0 { print; next }
# Skip other comments
/^\/\*/ { next }
# Print all `INSERT` lines. The single quotes are protected by another single quote.
/INSERT/ {
gsub( /\\\047/, "\047\047" )
gsub(/\\n/, "\n")
gsub(/\\r/, "\r")
gsub(/\\"/, "\"")
gsub(/\\\\/, "\\")
gsub(/\\\032/, "\032")
print
next
}
# Print the `CREATE` line as is and capture the table name.
/^CREATE/ {
print
if ( match( $0, /\"[^\"]+/ ) ) tableName = substr( $0, RSTART+1, RLENGTH-1 )
}
# Replace `FULLTEXT KEY` or any other `XXXXX KEY` except PRIMARY by `KEY`
/^ [^"]+KEY/ && !/^ PRIMARY KEY/ { gsub( /.+KEY/, " KEY" ) }
# Get rid of field lengths in KEY lines
/ KEY/ { gsub(/\([0-9]+\)/, "") }
# Print all fields definition lines except the `KEY` lines.
/^ / && !/^( KEY|\);)/ {
gsub( /AUTO_INCREMENT|auto_increment/, "" )
gsub( /(CHARACTER SET|character set) [^ ]+ /, "" )
gsub( /DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP|default current_timestamp on update current_timestamp/, "" )
gsub( /(COLLATE|collate) [^ ]+ /, "" )
gsub(/(ENUM|enum)[^)]+\)/, "text ")
gsub(/(SET|set)\([^)]+\)/, "text ")
gsub(/UNSIGNED|unsigned/, "")
if (prev) print prev ","
prev = $1
}
# `KEY` lines are extracted from the `CREATE` block and stored in array for later print
# in a separate `CREATE KEY` command. The index name is prefixed by the table name to
# avoid a sqlite error for duplicate index name.
/^(  KEY|\);)/ {
if (prev) print prev
prev=""
if ($0 == ");"){
print
} else {
if ( match( $0, /\"[^"]+/ ) ) indexName = substr( $0, RSTART+1, RLENGTH-1 )
if ( match( $0, /\([^()]+/ ) ) indexKey = substr( $0, RSTART+1, RLENGTH-1 )
key[tableName]=key[tableName] "CREATE INDEX \"" tableName "_" indexName "\" ON \"" tableName "\" (" indexKey ");\n"
}
}
# Print all `KEY` creation lines.
END {
for (table in key) printf key[table]
print "END TRANSACTION;"
}
'
exit 0
I can't give a guaranteed solution, but here's a simple technique I've been using successfully to handle similar issues (See "Notes", below). I've been wrestling with this script the last few days, and figure this is worth sharing in case there are others who need to tweak it but are stymied by the awk learning curve.
The basic idea is to have the script output to a text file, edit the file, then import into sqlite (More detailed instructions below).
You might have to experiment a bit, but at least you won't have to learn awk (though I've been trying and it's pretty fun...).
HOW TO
Run the script, exporting to a file (instead of passing directly to sqlite3):
./mysql2sqlite -u root -pMySecretPassWord myDbase > sqliteimport.sql
Use your preferred text editing technique to clean up whatever mess you've run into. For example, search/replace in Sublime Text. (See the last note, below, for a tip.)
Import the cleaned up script into sqlite:
sqlite3 database.sqlite < sqliteimport.sql
NOTES:
I suspect what you're dealing with is an encoding problem -- that '-' represents a character that isn't recognized by, or means something different to, either your shell, the script (awk), or your sqlite database. Depending on your situation, you may not be able to finesse the problem (see the next note).
Be forewarned that this is most likely only going to work if the offending characters are embedded in text data (not just data stored as text, but actual text content stored in a text field). If they're in a machine name (e.g., a foreign key field or entity ID), binary data stored as text, or text data stored in a binary field (e.g., a blob), be careful. You could try it, but don't get your hopes up, and even if it seems to work, be sure to test the heck out of it.
If in fact that '-' represents some unusual character, you probably won't be able to just type a hyphen into the 'search' field of your search/replace tool. Copy it from the source data (e.g., open the file, highlight the character, and copy it to the clipboard), then paste it into the tool.
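If you'd rather do the cleanup from the command line than in an editor, here's a rough sketch. It assumes GNU grep (built with -P/PCRE support) and GNU sed, and the en dash used below is only an example; substitute whatever character you actually find in your dump:
# Show lines that contain non-ASCII bytes, to see what the '-' in 'E-001' really is
grep -n -P '[^\x00-\x7F]' sqliteimport.sql
# Replace the offending character with a plain hyphen.
# U+2013 (en dash, bytes e2 80 93 in UTF-8) is just an example byte sequence.
sed -i 's/\xe2\x80\x93/-/g' sqliteimport.sql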
Hope this helps!
To convert MySQL to SQLite3 you can use Navicat Premium.
Related
I have the following code in my while loop and it is significantly slow, any suggestions on how to improve this?
open IN, "<$FileDir/$file" || Err( "Failed to open $file at location: $FileDir" );
my $linenum = 0;
while ( $line = <IN> ) {
if ( $linenum == 0 ) {
Log(" This is header line : $line");
$linenum++;
} else {
$linenum++;
my $csv = Text::CSV_XS->new();
my $status = $csv->parse($line);
my @val = $csv->fields();
$index = 0;
Log("number of parameters for this file is: $sth->{NUM_OF_PARAMS}");
for ( $index = 0; $index <= $#val; $index++ ) {
if ( $index < $sth->{NUM_OF_PARAMS} ) {
$sth->bind_param( $index + 1, $val[$index] );
}
}
if ( $sth->execute() ) {
$ifa_dbh->commit();
} else {
Log("line $linenum insert failed");
$ifa_dbh->rollback();
exit(1);
}
}
}
By far the most expensive operation there is accessing the database server; it's a network trip, hundreds of milliseconds or some such, each time.
Are those DB operations inserts, as they appear? If so, instead of inserting row by row, construct a string for one insert statement with multiple rows, in principle as many as there are, in that loop. Then run that as one transaction.
Test and scale down as needed if that adds up to too many rows. You can keep adding rows to the string for the insert statement up to a chosen maximum number, insert that, then keep going.†
A few more readily seen inefficiencies
Don't construct an object every time through the loop. Build it once before the loop, then use/repopulate it as needed inside the loop. Then there is no need for parse+fields here, and getline is also a bit faster.
You don't need that if statement for every read. First read one line of data; that's your header. Then enter the loop, without the ifs.
Altogether, without placeholders which now may not be needed, something like
my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
# There's a $table earlier, with its @fields to populate
my $qry = "INSERT into $table (" . join(',', @fields) . ") VALUES ";
open my $IN, '<', "$FileDir/$file"
or Err( "Failed to open $file at location: $FileDir" );
my $header_arrayref = $csv->getline($IN);
Log( "This is header line : #$header_arrayref" );
my #sql_values;
while ( my $row = $csv->getline($IN) ) {
# Use as many elements in the row (#$row) as there are #fields
push #sql_values, '(' .
join(',', map { $dbh->quote($_) } #$row[0..$#fields]) . ')';
# May want to do more to sanitize input further
}
$qry .= join ', ', @sql_values;
# Now $qry is ready. It is
# INSERT into table_name (f1,f2,...) VALUES (v11,v12...), (v21,v22...),...
$dbh->do($qry) or die $DBI::errstr;
I've also corrected the error handling when opening the file, since that || in the question binds too tightly in this case, and there's effectively open IN, ( "<$FileDir/$file" || Err(...) ). We need or instead of || there. Then, the three-argument open is better. See perlopentut.
If you do need the placeholders, perhaps because you can't have a single insert but it must be broken into many or for security reasons, then you need to generate the exact ?-tuples for each row to be inserted, and later supply the right number of values for them.
Can assemble data first and then build the ?-tuples based on it
my $qry = "INSERT into $table (", join(',', #fields), ") VALUES ";
...
my @data;
while ( my $row = $csv->getline($IN) ) {
push @data, [ @$row[0..$#fields] ];
}
# Append the right number of (?,?...),... with the right number of ? in each
$qry .= join ', ', map { '(' . join(',', ('?')x@$_) . ')' } @data;
# Now $qry is ready to bind and execute
# INSERT into table_name (f1,f2,...) VALUES (?,?,...), (?,?,...), ...
$dbh->do($qry, undef, map { @$_ } @data) or die $DBI::errstr;
This may generate a very large string, which may push the limits of your RDBMS or some other resource. In that case break @data into smaller batches, then prepare the statement with the right number of (?,?,...) row-values for a batch, and execute it in a loop over the batches.‡
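A rough sketch of that batching, with the same @data, @fields, $table and $dbh as above (the batch size of 1,000 is arbitrary; tune it to your data and your RDBMS's limits). Each do() here re-prepares the statement; if your batches are all the same size you could prepare once before the loop and handle only the final, shorter batch separately.
my $batch_size = 1_000;
# splice consumes @data as it goes, a batch at a time
while ( my @batch = splice @data, 0, $batch_size ) {
    # One (?,?,...) tuple per row in this batch
    my $tuples = join ', ', map { '(' . join(',', ('?') x @$_) . ')' } @batch;
    my $sql = "INSERT into $table (" . join(',', @fields) . ") VALUES $tuples";
    $dbh->do($sql, undef, map { @$_ } @batch) or die $DBI::errstr;
}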
Finally, another way altogether is to load the data directly from a file using the database's tool for that particular purpose. This will be far faster than going through DBI, probably even counting the extra step of processing your input CSV into another file that has only the needed data.
Since you don't need all data from your input CSV file, first read and process the file as above and write out a file with only the needed data (@data above). Then there are two possible ways:
Either use an SQL command for this (COPY in PostgreSQL, LOAD DATA [LOCAL] INFILE in MySQL, etc.); or,
Use your RDBMS's dedicated tool for importing/loading files: mysqlimport (MySQL), SQL*Loader/sqlldr (Oracle), etc. I'd expect this to be the fastest way.
The second of these options can also be done from within a program, by running the appropriate tool as an external command via system (or, better yet, via suitable libraries).
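For example, with MySQL's mysqlimport it could look roughly like this. The path, the $dbname variable, and the write_csv helper are only illustrative, not part of the code above; note that mysqlimport derives the table name from the file's basename:
my $datafile = "/tmp/$table.csv";
write_csv($datafile, \@data);   # hypothetical helper that writes @data out as CSV
system('mysqlimport', '--local', '--fields-terminated-by=,', $dbname, $datafile) == 0
    or die "mysqlimport failed: $?";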
† In one application I've put together as many as millions of rows in the initial insert -- the string itself for that statement was in the high tens of MB -- and that keeps running with ~100k rows inserted in a single statement daily, for a few years now. This is PostgreSQL on good servers, and of course YMMV.
‡ Some RDBMSs do not support a multi-row (batch) insert query like the one used here; in particular, Oracle seems not to. (We were informed in the end that that is the database used here.) But there are other ways to do it in Oracle; please see the links in the comments, and search for more. The script will then need to construct a different query, but the principle of operation is the same.
I have a FileMaker Pro 12 database in which I can sort and export selections using checkboxes. I have a "Select none" script that removes all the ticked items, but I would also like to be able to search and then check all of the results.
I have a button on the checkbox image that performs a Set field and the following:
Case (
ValueCount ( FilterValues ( Table::Checkbox ; Table::ID ) ) > 0;
Substitute ( Table::Checkbox ; Table::ID & ¶ ; "" ) ;
Table::Checkbox & Table::ID & ¶
)
Conditional formatting of the checkbox is:
not ValueCount ( FilterValues ( Table::Checkbox ; Table::ID ) ) > 0
The script for "Select none" is:
Set Field [Table::Checkbox; ""]
So what would the "Select all" script need to be?
There are quite a few methods to collect values across a found set into a return-delimited list. Looping is fine if your found set is fairly small; otherwise it may prove too slow.
Since your target is a global field anyway, you could use a simple:
Replace Field Contents [ No dialog; Table::gCheckbox; List ( Table::gCheckbox ; Table::ID ) ]
This will append the current found-set's values to the existing list. To start anew, begin your script by:
Set Field [ Table::gCheckbox; "" ]
Note:
In version 13, you can use the new summary field option of "List".
Caveat:
Make sure you have a backup while you experiment with Replace Field Contents[]; there is no undo.
You could write a script to walk through the current found set and get all of the IDs:
Set Variable [$currentRecord ; Get(RecordNumber)]
#
Go to Record [First]
Loop
Set Variable [$ids ; Table::ID & ¶ & $ids ]
Go to Record [Next ; Exit after Last]
End Loop
#
Go to Record [By Calculation ; $currentRecord]
#
Set Field [Table::Checkbox ; $ids ]
This method would save your current position, walk through the current found set, compile the ids into a variable, return you to your position, and set the checkbox field.
Table:
localization_strings = {
string_1 = "Text Here",
string_2 = "Some More Text Here",
string_3 = "More Text"
}
This is obviously not the whole table, just a small sample; the real table is over 500 lines. The reason I don't just redo the table is that other functions reference it and I don't have access to those files to fix them, so I have to find a workaround. It would also be quite tedious and could cause problems with other code.
I have made 2 attempts at solving this problem, but I can only get one of the two values I want (incorrect terminology, I think), and I need both: one is display text and one is data for a function call.
Attempts:
-- Attempt #1
-- Gives me the string_#'s but not the "Text"...which I need, as I want to display the text via another function
LocalizationUnorderedOpts = {}
LocalizationOpts = {}
for n,unordered_names in pairs(localization_strings) do
if (unordered_names) then
table.insert( LocalizationUnorderedOpts, n)
end
end
io.write(tostring(LocalizationUnorderedOpts) .. "\n")
table.sort(LocalizationUnorderedOpts)
for i,n in ipairs(LocalizationUnorderedOpts) do
if (n) then
io.write(tostring(i))
table.insert( LocalizationOpts, { text = tostring(LocalizationUnorderedOpts[i]), callback = function_pointer_does_not_matter, data = i } )
end
end
-- Attempt #2
-- Gives me the "Text" but not the string_#'s...which I need to as data to the callback to another function (via function pointer)
LocalizationUnorderedOpts = {}
LocalizationOpts = {}
for n,unordered_names in pairs(localization_strings) do
if (unordered_names) then
table.insert( LocalizationUnorderedOpts, localization_strings[n])
end
end
io.write(tostring(LocalizationUnorderedOpts) .. "\n")
table.sort(LocalizationUnorderedOpts)
for i,n in ipairs(LocalizationUnorderedOpts) do
if (n) then
io.write(tostring(i))
table.insert( LocalizationOpts, { text = tostring(LocalizationUnorderedOpts[i]), callback = function_pointer_does_not_matter, data = i } )
end
end
If I understand it correctly, you need to sort a non-array table. Your first attempt has done most of the work: build another table whose values are the keys of the original table.
What's left is how to get the original values like "Text Here"; for that, you need to index the original table:
for k, v in ipairs(LocalizationUnorderedOpts) do
print(v) --original key
print(localization_strings[v]) --original value
end
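Putting it together, here is a sketch of your first attempt extended so each entry carries both pieces: the original value as the display text and the original key as the callback data. I've kept the text/callback/data shape from your code; use data = i instead if the callback really needs the sorted index:
LocalizationUnorderedOpts = {}
LocalizationOpts = {}

-- collect the keys (string_1, string_2, ...) so they can be sorted
for k in pairs(localization_strings) do
    table.insert(LocalizationUnorderedOpts, k)
end
table.sort(LocalizationUnorderedOpts)

for i, k in ipairs(LocalizationUnorderedOpts) do
    table.insert(LocalizationOpts, {
        text     = localization_strings[k],           -- "Text Here", ...
        callback = function_pointer_does_not_matter,
        data     = k,                                 -- "string_1", ...
    })
end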
This is my array:
ListTabs=""
ListTabs=$ListTabs"T_Tab1\n"
ListTabs=$ListTabs"T_Tab2\n"
ListTabs=$ListTabs"T_Tab3"
echo $ListTabs
arrArr=0
OLD_IFS=$IFS;
IFS=\n
for listArr in ${ListTabs[@]};
do
#echo $listArr
MYDIR[${ARR}]=$listArr
(( arrIdx = $ARR+ 1 ))
done
IFS=$OLD_IFS;
Then I sort the IDs from a SELECT in this way (FILESELECT_DAT is the output file of the query):
sort -u ${FILESELECT_DAT} > ${SORT_OUT1}
OK. Now I have to make a loop so that for each element of the array it runs a SELECT where ID = the values in ${SORT_OUT1}. So there are 2 loops: a while over the IDs and a for loop for the SELECT. How can I loop over the IDs inside ${SORT_OUT1}? I think this is the beginning:
id=""
while read $id
do
for ListTabs in ${listArr}
do
-
-
SELECT * FROM $ListTabs (but the result is always the first table in each loop)
WHERE ID = ${id} (but it shows me all IDs)
-
-
done < ${SORT_OUT1}
Any ideas? Thanks
listArr=( T_Tab{1,2,3} )
sort -u "$FILESELECT_DAT" > "$SORT_OUT1"
while read id; do
for ListTabs in "${listArr[#]}"; do
...
done
done < "$SORT_OUT1"
Take care that nothing in the body of the for-loop reads from standard input, or it will consume part of the input intended for the read command. To be safe, use a separate file descriptor:
while read -u 3 id; do
...
done 3< "$SORT_OUT1"
In Visual FoxPro, I have a cursor which is the result of a SQL query. When I export the content of that cursor to a CSV file using the statement:
COPY TO "c:\test.csv" type DELIMITED
all the data is messed up. I do not specify any delimiter, so basically FoxPro takes the default, which exports every column in that cursor. Now when I run the same command to an XLS file, and then convert it to a CSV file... it works very well:
COPY TO "c:\test.xls" type XL5
Has anyone had such an issue? Is anyone still using FoxPro and doing stuff like this?
Have you tried using TYPE CSV in the COPY TO command?
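For example (same path as in the question, assuming your VFP version supports the CSV type):
COPY TO "c:\test.csv" TYPE CSV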
Personally I never liked the built-in DBF to CSV converters. They always seemed to do things I did not want them to do. So I just wrote my own. Here is some code to get you started.
LOCAL lnFields
SELECT DBF
lnFieldCount = AFIELDS(laFields)
lnHandle = FOPEN("filename.csv", 1)
ASSERT lnHandle > 0 MESSAGE "Unable to create CSV file"
SCAN
lcRow = ""
FOR lnFields = 1 TO lnFieldCount
IF INLIST(laFields[lnFields,2], 'C', 'M')
lcRow = lcRow + IIF(EMPTY(lcRow), "", ",") + '"' + ;
STRTRAN(EVALUATE(laFields[lnFields,1]),'"', '""') + '"'
ELSE
lcRow = lcRow + IIF(EMPTY(lcRow), "", ",") + ;
TRANSFORM(EVALUATE(laFields[lnFields,1]))
ENDIF
ENDFOR
FPUTS(lnHandle, lcRow)
ENDSCAN
FCLOSE(lnHandle)