Zend Framework saving emojis as question marks in SQL Table

We are (still) using ZF1, to write/read from a Microsoft SQL database.
As long as a table field is of type NVARCHAR, everything is fine: special characters (Chinese, Polish, etc.) and emojis are saved correctly, exactly as entered in an HTML form.
But with column type TEXT, question marks appear in place of the special characters/emojis.
Oddly, everything is fine on servers running PHP 5.6 with FreeTDS and php-mssql installed. The problem occurs on a system running PHP 7.4, using only SQLSRV as the DB driver.
According to this thread, a simple N prefix should be enough... as long as the fields are NVARCHAR. We are using the update and insert methods of Zend Framework, so we have no clue how to save all these characters into TEXT columns.
When connecting, our bootstrap file sets UTF-8 as the character set via a driver option.
Is there any workaround we could try? We cannot change the column type because there are reasons why some columns are TEXT.

Why are you using text? It has been deprecated for 16 years. Besides, text does not support Unicode characters; that would be ntext, but that too has been deprecated for 16 years.
To store up to 2GB of Unicode characters, use nvarchar(MAX). For a column that already exists, ALTER it:
ALTER TABLE dbo.YourTable ALTER COLUMN YourColumn nvarchar(MAX);
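Why a non-Unicode column ends up holding question marks can be illustrated outside SQL Server. The sketch below is only an analogy in Python, not SQL Server's actual conversion code: it assumes the column's collation maps to code page 1252 and uses Python's `cp1252` codec with `errors="replace"` to mimic the substitution of unrepresentable characters. The function name `store_in_code_page` is hypothetical.

```python
def store_in_code_page(text: str, code_page: str = "cp1252") -> str:
    """Round-trip text through a single-byte code page.

    Every character the code page cannot represent is replaced by '?',
    which is the kind of loss a non-Unicode column causes.
    """
    return text.encode(code_page, errors="replace").decode(code_page)


print(store_in_code_page("café"))   # every character exists in cp1252
print(store_in_code_page("łódź"))   # ł and ź do not exist in cp1252
print(store_in_code_page("Hi 😀"))  # the emoji does not exist in cp1252
```

The server may substitute a different number of question marks per emoji than this sketch does; the point is only that the original character is lost at conversion time and cannot be recovered when reading the column back.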

As per the OP's response in the comments, the database setup can be put aside, given that the TEXT column type behaves as expected under PHP 5.6 and the operation fails on the same database under PHP 7.4.
This narrows the issue down to runtime PHP charset handling. Without knowing all the specifics of the upgrade from PHP 5.6 to 7.4, the issue may lie in:
Different default charset in php.ini
Different behavior in the mssql driver with charset handling
Different operating system default locale which may affect the script
Different php file encoding (the encoding of the .php file itself may alter the charset used at runtime)
Since the issue cannot be unambiguously identified from the provided information, I can only suggest checking and eliminating the possibilities one by one:
try detecting the actual encoding of the input text using any of the mb_* functions (maybe refer to this question)
try adding a comment containing special characters to your script file and be sure to save the file as UTF-8 (or the desired encoding)
try explicitly setting the driver charset (refer to this question), or setting it in the connection string if supported
try setting the default PHP locale using setlocale and similar functions
Those steps may guide you toward a resolution, but without guarantees.
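The first check in the list above (finding out what encoding the input really arrives in) can be sketched as follows. This is a Python illustration of the idea rather than the PHP mb_* calls the answer refers to, and the function name `describe_input` and its return strings are hypothetical.

```python
def describe_input(raw: bytes) -> str:
    """Classify raw bytes received from a form before they reach the driver."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are in some other encoding (or are corrupted).
        return "not valid UTF-8"
    if text.isascii():
        # Nothing in this input can expose a charset problem.
        return "ASCII only"
    return "valid UTF-8 with non-ASCII characters"


print(describe_input("😀".encode("utf-8")))
print(describe_input(b"\xe5"))  # a lone Latin-1 byte for 'å'
print(describe_input(b"abc"))
```

If the input is already valid UTF-8 at this point, the loss is happening later, in the driver or the database, which helps rule out the form and the script file as the cause.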

Related

How to check latest value of standard sequence in ODI repository

I have a mapping which is using the standard sequence in ODI [Oracle Data Integrator]. I want to reset the value of that particular sequence.
It says this standard sequence is stored in the repository, but I am not sure which one. Could you please advise in which repository [MASTER, WORK or RUN] this sequence can be viewed and modified without changing anything at the mapping level?

Standard sequences are stored in the WORK and RUN repositories. I don't think you can reset them using ODI Studio.
Specific sequences are stored in a database table you specified.
Both types should be avoided where possible because they don't perform well. It is far better to use native sequences provided by your database when available.

Reading Excel files into SQL Server not using OLEDB/ODBC

Is there a way to read Excel 2010/2013 files natively ?
We are importing Excel files into SQL Server and have come across a specific issue: it looks as though the Excel driver decides the type of a destination data column by testing the contents of only the first 65K-odd rows.
This has only just started happening within the past 3 weeks, before then we had managed to convince Excel of the error of its ways by a simple registry hack that forced it to read the entire set of rows.
The problem is that we have some datasets that contain, say 120,000 rows and these may have all numeric values for the first 80,000, then it will have some non-numeric yet vital information that we wish to retain.
Yes, the data is not correctly typed, we know.
Because the source data type has been determined by the Excel driver to be a float it promptly turns all our non-numeric values into NULLs - not very useful.
If there was some other way to read an Excel file not using the standard ODBC/OLEDB drivers that might help.
We have tried saving it into various other formats before importing but of course all these exports use the Excel driver which has the problem.
I think the closest we have got is to save it as XML (which is frankly huge at 800MB) and then shred it using standard xpath queries and some pretty dodgy workarounds to handle no doubt well-formed but still tricky variations on how column data is represented.
Edit: changed title to more closely reflect the issue

As well as the registry key, have you tried setting the following when connecting to your Excel file:
;Extended Properties="IMEX=1"
See here
Also see this MSDN article
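For context, here is a sketch of where the IMEX=1 fragment sits in a full connection string. The provider name `Microsoft.ACE.OLEDB.12.0`, the `Excel 12.0 Xml` format tag and the `HDR` setting are assumptions for an .xlsx file and are not taken from the answer; the function name and the file path are hypothetical.

```python
def excel_connection_string(path: str, imex: int = 1, header: bool = True) -> str:
    """Build an OLE DB connection string for an Excel workbook.

    IMEX=1 is commonly described as "import mode", in which columns
    with mixed content are read as text instead of being forced to a
    single guessed type.
    """
    extended = f"Excel 12.0 Xml;HDR={'YES' if header else 'NO'};IMEX={imex}"
    return (
        "Provider=Microsoft.ACE.OLEDB.12.0;"
        f"Data Source={path};"
        f'Extended Properties="{extended}";'
    )


print(excel_connection_string(r"C:\data\book.xlsx"))
```

Note that the extended properties must be wrapped in double quotes because they contain semicolons of their own.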

Character issue in WordPress site

I have a problem with a WordPress site (it's in Swedish). For some reason I can't use all characters when I'm writing posts - the characters å, ä and ö become Ã¥ Ã¤ Ã¶. The site is a webshop and I have the Woocommerce plugin installed. The same problem with åäö occurs in the long product descriptions of Woocommerce.
Anyone know what I can do to solve this? The character encoding in WordPress admin panel is set to UTF-8 and so is the database charset in wp-config.
In the database in phpmyadmin the collation of the wp-posts tables are set to "utf8_general_ci". Is that the problem?
This thing has never happened to me before, even though I have built a lot of WP sites in the past. Therefore I don't know what to do. Maybe the solution is simple but I want to know what I'm doing before doing anything so I don't risk messing up the site.
Would really appreciate some help with this, thanks.

When "national special characters", i.e. non-ASCII characters, are displayed incorrectly, you probably have a charset-related error. The easiest way to fix this is usually to make sure that you are using UTF-8 everywhere.
(For Swedish in particular, you can use ISO-8859-1 (worst), ISO-8859-15 (better) or UTF-8 (best).)
You need to use the same charset everywhere, from the database to the HTML declaration.
In your theme's header.php file, please make sure that the declared charset is UTF-8.
In your text editor or on your server, please make sure your theme files are being saved as UTF-8.
In MySQL, please make sure that the table schema is set to use UTF-8.
In MySQL, please make sure that connections default to UTF-8: mysql --default-character-set=utf8
In PHP, try setting the connection to UTF-8 with mysqli_set_charset
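The exact symptom from the question can be reproduced in a few lines, which shows why one mismatched link in the chain is enough. This is a minimal Python sketch, assuming the mismatched layer interprets the bytes as Windows-1252; the function name `misread_utf8_as` is hypothetical.

```python
def misread_utf8_as(text: str, wrong_charset: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong charset."""
    return text.encode("utf-8").decode(wrong_charset)


for ch in "åäö":
    # Each letter is two bytes in UTF-8, so it turns into two characters.
    print(ch, "->", misread_utf8_as(ch))
```

Each Swedish letter becomes the two-character sequence reported in the question, which suggests the text is written as UTF-8 somewhere and read as a single-byte charset somewhere else.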

In order to fix the Character Encoding Mismatch Problem in WordPress,
Open the 'wp-config.php' file in a text editor (the wp-config.php file can be found in the directory where you installed WordPress).
Find the following two lines and comment them out:
define('DB_CHARSET', 'utf8');
define('DB_COLLATE', '');
They should look like the following after you comment them out:
//define('DB_CHARSET', 'utf8');
//define('DB_COLLATE', '');
Now upload the updated ‘wp-config.php’ file to your webhost.
This character encoding problem can happen after a database upgrade too so it doesn’t hurt to keep this trick in your mind just in case.

In another case, if you are using PHP Dom (loadHTML) somewhere, there is a need to load HTML as UTF-8. I have fixed it by:
Replacing
#$dom->loadHTML($html);
with
#$dom->loadHTML('<?xml encoding="UTF-8">' . $html);

Getting data from mdb database file in my Windows program

I have for some time helped a customer to export mdb table data to csv files (and then to further process these csv files). I have used Ubuntu, so mdbtools (mdb viewer) has been available to me. Now the customer wants me to automate the work I do in the form of a Windows program. I have run into two problems:
After some hours, I still haven't found a free tool on Windows that can export my table data in a way that I can incorporate in a program/script. Jackcess (jackcess.sourceforge.net) looks promising, but when running the downloaded jar a totally unrelated Nokia Suite program pops up...
I have managed to open two of the tables in a Python program by using the pyodbc module, but all the other tables fail to open because of "no read permissions". Until now I thought that there were no access restrictions on the database, because mdb viewer on Ubuntu opens all tables without any fuss. There is no other file available to me, just the mdb file. One possibility might be that this is not a permissions problem at all, but a problem with special characters in column names. All the tables that I cannot open have at least one column name with a national character, whereas the two tables I can open do not. I tried to use square brackets in the SQL select called from Python, like so:
SQL = 'SELECT [colname] from SomeTable;'
but it makes no difference. I cannot fetch data from the columns that do not contain national characters either (except from the two tables that do work).
If it indeed is a permission problem, any solution must also be possible for my program to perform, there must not be any manual steps.
Edit: The developer of the program that produces the mdb files has confirmed that there are no restrictions on any tables. So, the error message "no read permissions" is misleading. I will instead focus on getting around what I presume is a problem with national characters in column names. I will start with the JSDB approach suggested below. Thanks everyone!
Edit 2: I made a discovery that I feel is important: All tables that I can open using pyodbc have Owner=Admin whereas all tables that I cannot open have no owner at all (empty string it seems, "Owner=").
Edit 3: I gave JDBC a shot. Same error again, as one could expect given the finding in Edit 2. Apparently the problem to solve is the table ownership (although MDB Viewer under Linux doesn't seem to care about that...). Since the creator of the files says he didn't introduce any permission settings, I guess the strange table ownership could be the result of using new programs (like 2010) to read data produced in an old program (like sometime in the 90s), or was introduced during some migration of the old program. Any ideas on how to solve it?

You might be able to use VBScript. VBScript is usually used in ASP files for web pages, but can be used stand alone as a Windows program as well.
VBScript is free as it's code you write in Notepad.
Others may come up with better answers for you. Good luck.

French character 'à' is being read as 'Ã' in Classic ASP server-side and database

I have a form that accepts text and is posted to the server.
If a user were to input a French character such as 'à', it will be read as 'Ã' by Classic ASP code and be stored as 'Ã' in a SQL Server 2005 database.
A similar effect happens to other accented characters. What's happening?

It's a problem of character encoding. Apparently your server and database are configured with charsets Windows-1252 or ISO-8859-1, and you're receiving UTF-8 data.
You should check that your server sends a Content-Type header whose value ends with "charset=iso-8859-1".
I guess your server doesn't send the charset of the documents, and people with default configuration set to UTF-8 send UTF-8 characters which are stored as iso-8859-1 (or Windows-1252) in your database.

See my answer here for the detail on what is likely happening.
Ultimately you need to ensure the encoding used in the form post matches the Response.CodePage of the receiving page. You can configure the actual character set sent by a form by placing the accept-charset attribute on the form element. The default accept-charset is the document's charset.
What exactly do you have the ASP files codepages set to (both on the page containing the form and the page receiving the post)?
What are you setting the Response.CharSet value to in the form page?
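The effect of a mismatch between the form's charset and the receiving page's code page can be sketched as follows. This is a Python approximation using percent-encoding from `urllib.parse`, not Classic ASP, and it assumes the receiving side decodes with Windows-1252; the function name `post_and_read` is hypothetical.

```python
from urllib.parse import quote, unquote


def post_and_read(text: str, form_charset: str, page_charset: str) -> str:
    """Percent-encode text as the form would, then decode it as the page would."""
    wire = quote(text, encoding=form_charset)
    return unquote(wire, encoding=page_charset)


print(quote("à", encoding="utf-8"))       # bytes sent by a UTF-8 form
print(quote("à", encoding="iso-8859-1"))  # bytes sent by an ISO-8859-1 form
print(repr(post_and_read("à", "utf-8", "utf-8")))
print(repr(post_and_read("à", "utf-8", "cp1252")))
```

In the mismatched case the second byte decodes to a non-breaking space, which is easy to miss on screen and would explain why the value appears to be just 'Ã'.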

I have just gone around in circles trying to fix this once and for all in my old Classic ASP app, which uses jQuery Ajax posts to store info in a database. Tried every combination with no luck.
Ended up modifying any sql selects by using the stored proc mentioned here and magic happened. Data is still stored corrupted in the database, but is displayed correctly on the site.
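One way data can stay corrupted in storage and still display correctly is to reverse the wrong decoding at read time. The stored procedure mentioned above is not shown here, so the following Python sketch only illustrates the general idea and is not a description of what that procedure does. It assumes the corruption came from UTF-8 bytes being read as Windows-1252; the function name `repair_mojibake` is hypothetical.

```python
def repair_mojibake(stored: str, wrong_charset: str = "cp1252") -> str:
    """Undo a UTF-8-read-as-wrong-charset corruption where possible."""
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return stored.encode(wrong_charset).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not corrupted this way, so leave it alone.
        return stored


print(repair_mojibake("Ã\xa0"))  # corrupted form of 'à'
print(repair_mojibake("Ã¥"))     # corrupted form of 'å'
print(repair_mojibake("à"))      # already correct, returned unchanged
```

Repairing on read hides the problem rather than fixing it, and it fails for any character whose UTF-8 bytes have no mapping in the wrong charset, so fixing the encoding at write time remains preferable.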
