"?" character in MSSQL DB getting replaced with (capital A with grave accennt) when displayed by ASP script - sql-server

I'm attempting to provide support for a legacy ASP/MSSQL web application. I wasn't involved in the development of the software (the company that built it no longer exists) and I'm not the admin of the server where it's hosted; I just manage the hosting for the site's owners via a reseller account. I'm also not an ASP developer (more a PHP guy) and am not that familiar with it beyond the basics: updating DB connection strings after server migrations, etc.
The issue is that the site in question stores the content of individual pages in an MSSQL database, and much of the content includes links. Almost all of the internal links on the site are formatted like "main.asp?123" (with "123" being the ID of a database row). The problem is, starting sometime in the last 8 months or so*, something caused the links in the DB content to show up as "main.aspÀ123" instead. In other words, the "?" character is being replaced by the "À" character (capital A with grave accent), which, of course, breaks all of those links. Note that Stack Overflow won't allow me to include that character in the post title, because it seems to think that it indicates I'm posting in Spanish...?
(*Unfortunately I don't know the timing beyond that. The site owners didn't know when the issue started occurring, so all I have to go by is an archive.org snapshot from last October, when it was still working.)
I attempted to manually change the "?" character in one of the relevant DB records to "&#63;" (the HTML entity for the question mark), but that didn't make any difference. I also checked the character encoding of the HTML code used to display the content, but that doesn't seem to be the cause either: the same ASP files contain hard-coded links to some of the same pages (formatted exactly the same way), and those work correctly; the "?" doesn't get replaced.
I've also connected to the database directly with SQL Server Management Studio Express, but couldn't find any charset/character-encoding options for either the database or the table (I gather SQL Server handles this via collations).
And I've tried contacting the hosting provider, but they (M247 UK, in case anyone is curious) have been laughably unhelpful. Their responses have been along the lines of "we checked a totally different link that wasn't actually the one you clearly described AND highlighted in a screenshot, and it works, so the problem must be resolved, right?" Suffice it to say, I wouldn't recommend them. I used to be a customer of RedFox hosting, and the quality of customer service has dropped off substantially since M247 bought them.
Any suggestions? If this were PHP/MySQL, I'd probably start by creating a small test script that did nothing but fetch one of the relevant records and display its contents, to narrow down the issue (something like the sketch below), but I'm not familiar enough with ASP to do that here, at least not without a fair amount of googling (and most of the info I can find is specific to ASP.NET instead).
Edit: the thread suggested as a solution appears to be for character encoding issues when writing to MSSQL, not reading from it. I've tried the solutions suggested in that thread, and none of them makes any difference.
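Edit 2: for reference, this is the kind of minimal test I have in mind, sketched here in Python with pyodbc rather than ASP (the driver, connection details, table, and column names are all placeholders; I haven't run this against the actual server):
import pyodbc

# Placeholders -- substitute the real driver, server, credentials, table, and column.
conn = pyodbc.connect(
    'DRIVER={ODBC Driver 17 for SQL Server};'
    'SERVER=myserver;DATABASE=mydb;UID=myuser;PWD=mypass'
)
row = conn.execute('SELECT content FROM pages WHERE id = ?', 123).fetchone()

# repr() shows exactly what's stored, so a literal 'À' in the data
# (vs. one introduced when the page renders) would be unambiguous.
print(repr(row[0]))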

Looks like you're converting from UNICODE to ASCII somewhere along the line...
Have a look at this to get a quick demo of what happens. In particular, pay attention to the ascii derived from int, versus the ascii derived from unicode...
SELECT
    t.n,
    ascii_char = CHAR(t.n),                             -- value interpreted as an ASCII/code-page character
    unicode_char = NCHAR(t.n),                          -- value interpreted as a Unicode character
    unicode_to_ascii = CONVERT(varchar(10), NCHAR(t.n)) -- the Unicode character forced back into a varchar
FROM (
    SELECT TOP (1024)
        n = ROW_NUMBER() OVER (ORDER BY ao.object_id)   -- just a cheap way to generate 1..1024
    FROM sys.all_objects ao
) t
WHERE 1 = 1
    --AND CONVERT(varchar(10), NCHAR(t.n)) = 'À'        -- uncomment to see which code points land on 'À'
;

I found a workaround that appears to do the trick: I was previously trying to replace the ? in the code with &#63 (I took out the ; so that the code shows rather than the rendered output), which didn't work. BUT it seems to work if I use &quest instead.
One thing to note: it seems I was originally incorrect in thinking that the issue only affected content being read/displayed from the MSSQL DB. It looks like the same problem was also occurring with static content being "echo'd" by code in the ASP scripts (I'm more of a PHP guy; I'm not sure what the correct term for ASP's equivalent of echo is). The links that were hardcoded as static HTML (rather than HTML dynamically output by ASP) were unaffected. Changing the ? to &quest worked for those too (the hardest part was tracking down the file I needed to edit).

Related

DBT Snowflake: 'utf-8' codec can't decode byte 0xa0 in position 1031: invalid start byte

I get the following error when I add fields (even commented out) to my query in DBT.
I am using DBT Cloud, running against Snowflake.
The query runs fine as-is; it even has the table from which I want the fields in the join at the bottom.
However, as soon as I put in the fields, even commented out, I get the error in the title.
Anyone have any idea why this is happening?
So in the end, I found out that somehow a non-ASCII character (a non-breaking space, by the look of the 0xa0 in the error) crept into my script while copying code back and forth between the DBT interface and SSMS. Still not sure how this happened. Nonetheless, DBT has a Python module that parses the text, and that module was having an issue with the character.
So in the end, I simply had to retype my code... and then it worked.
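In case it helps the next person: 0xa0 is a non-breaking space in Windows-1252/Latin-1, exactly the kind of invisible character that sneaks in when copying between editors. Rather than retyping the whole script, a short sketch like this (file name is a placeholder) will point at the exact offset of the first byte that isn't valid UTF-8:
# scan a file for the first byte sequence that isn't valid UTF-8
data = open('my_model.sql', 'rb').read()
try:
    data.decode('utf-8')
    print('file is clean UTF-8')
except UnicodeDecodeError as e:
    # e.start is the offset of the offending byte
    print('bad byte %r at offset %d (%s)' % (data[e.start:e.end], e.start, e.reason))
    print('context:', data[max(0, e.start - 30):e.end + 30])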

Replace is not working for weird character

I use UPDATE a SET GR_P = REPLACE(GR_P,'','') FROM mytable a to replace things.
But the REPLACE function is not working for the character below:
In Query Analyzer it works, but when I use an SSIS Execute SQL Task or OLE DB Source, it gives me this error:
No Connection manager is specified.
In Toad against Oracle (since that's one of your tags), I issued this (pressing ALT-12 to get the female symbol) and got 191 as a result. Note that selecting it back using CHR(191) shows an upside-down question mark, though.
select ascii('♀') from dual;
Given that, this worked, but it's Oracle syntax; your mileage may vary.
UPDATE mytable SET GR_P = REPLACE(GR_P, CHR(191));
Note that if it does not work, that symbol could stand for another control character. You may need to use a regular expression to eliminate all characters not in a-zA-Z0-9, etc. I suspect you'll need to update your tags to get a more accurate answer.
Maybe this info will help anyway. Please post back what you find out.
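One more data point: that 191 is almost certainly a lossy-conversion artifact (191 is '¿', the upside-down question mark, in Latin-1), not the character's real code point. If you can paste the character into anything that runs Python, a couple of lines will print the actual Unicode code point; the female sign, for instance, is U+2640 (9792):
# paste the character between the quotes; this prints its real code point(s)
s = u'♀'
for ch in s:
    print(ch, ord(ch), hex(ord(ch)))
That number can then go into NCHAR(9792) on SQL Server, or UNISTR on Oracle, for the REPLACE.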

How to write more than ~1,500 characters to a fusion table cell via the SQL API

I have an AppEngine app using the Google API Python Client to access the Fusion Tables API over OAuth. When trying to run UPDATE commands, the API client is putting my whole SQL statement into the query string in the URL.
So when I write a SQL statement like this...
UPDATE <table ID> SET <column> = 'some really long piece of text...' WHERE ROWID = '1'
...I get an API call like this:
POST https://www.googleapis.com/fusiontables/v1/query?sql=UPDATE+<table ID>+SET+<column>+%3D+%27some+really+long+piece+of+text...%27+WHERE+ROWID+%3D+%271%27&alt=json
All this works fine for most things. But I'm encountering errors when writing more than ~1,500 characters (depending on how many of those are special characters I have to escape) to that cell. The answer to another question says the limit to the number of characters in a cell is 1,000,000. I'm assuming this may be because the URL is getting just way too long (for something in the pipeline from AppEngine to the Fusion Tables API servers), maybe kind of like the issue addressed in this question.
With other APIs, I'm used to sending parameters in form data for POST requests not the query string, which keeps the URL a manageable size. But the Fusion Tables API docs seem to suggest that the query string is the proper place and that nothing should be sent in the request body. The API client seems to be dutifully following this pattern (and in fact using that as the default behavior for ALL Google APIs??).
So my question is threefold:
Does anyone know if the URL can get too long as I suspect is happening?
Is the query string really the only place to send the SQL statement, or if I find a way to include it in the request body, will the API accept it?
If the query string really is the only way and it really can get too long, is there another way to post large strings to fusion table cells?
So I tried it. The answer to number 2 is that you CAN put the query in the request body as form data and the API will take it (contrary to what the docs suggest). My request to Fusion Tables from App Engine now looks like this:
import httplib2
import urllib

value = 'Some really long string...'

# http is an instance of httplib2.Http
http.request(
    'https://www.googleapis.com/fusiontables/v1/query?alt=json',
    method='POST',
    body=urllib.urlencode({
        'sql': unicode('UPDATE <table ID>' +
                       ' SET <column>=\'' + value + '\'' +
                       ' WHERE ROWID = \'1\'').encode('utf-8')
    }),
    headers={'Content-Type': 'application/x-www-form-urlencoded'})
Note that I believe the Content-Type header is necessary, but I haven't tried it without it.
I also left some unicode and UTF-8 encoding stuff in there because chances are, for anyone who needs to support a few thousand characters, a few of those characters might be non-ascii and urllib.urlencode doesn't like non-ascii characters...
I'd still appreciate an answer to numbers 1 and 3 if anyone has any more information, but this seems to work for me for now. I'm curious as to why using form data in the request body wasn't the default approach from the beginning for the Fusion Tables team...

Cell.cross() returns error in Google Refine projects

I'm trying to create a new column based on my main project's Date column that pulls timeline events from another Google Refine project:
cell.cross("Clean5 Timeline", "TimelineDate").cells["TimelineEvent"].value[0]
The dates are in the same format in both Google Refine projects. But it fills no cells, and I get this error:
Error: Cannot retrieve field from null
This — 
cell.cross("Clean5 Timeline", "TimelineDate")
— returns [ ] for rows where there should be a match.
And this —
cell.cross("Clean5 Timeline", "TimelineDate").cells["TimelineEvent"]
— returns null for those rows.
I copied the syntax directly from the GREL help files: http://code.google.com/p/google-refine/wiki/GRELOtherFunctions. Can anyone suggest what I may be overlooking?
Thanks.
Without access to your projects it's going to be difficult to answer this, but the first thing I'd suggest is that you trim back your expression to find out exactly where the null is coming from.
Since
cell.cross("Clean5 Timeline", "TimelineDate")
is returning an empty array ([]), nothing based on that result is going to work.
There are four possible problems that I can think of: 1) the project name is wrong, 2) the column name is wrong, 3) the data values don't match (or Refine doesn't think they do), or 4) you are running into a caching bug with cross() that exists in Refine 2.5.
Restarting the Refine server should clear the cache if you're running into the bug and it's also fixed in the current source repository. The fix will be included in OpenRefine 2.6.

SQL 2005 CSV Import Quote Delimited with inner Quotes and Commas

I have a CSV file with quote text delimiters. Most of the 90,000 rows are fine, but I have a few rows with a text field that contains both a quote and a comma. For example, the field's value would be:
AB",AB
When delimited, this becomes
"AB"",AB"
When SQL 2005 attempts to import this I get errors such as...
Messages
Error 0xc0202055: Data Flow Task: The column delimiter for column "Column 4" was not found.
(SQL Server Import and Export Wizard)
This only seems to happen when a quote and comma are in a text value together. Values like
AB"AB which becomes "AB""AB"
or
AB,AB which becomes "AB,AB"
work fine.
Here are some example rows...
"1464885","LEVER WM","","B","MP17"
"1465075",":PLT-BC !!NOTE!!","","B",""
"1465076","BRKT-STR MTR !NOTE!","","B",""
"1465172",":BRKT-SW MTG !NOTE!","","B","MP16"
"1465388","BUSS BAR !NOTE!","","B","MP10"
"1465391","PLT-BLKHD ""NOTE""","","B","MP20"
"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"
The last row is an example of the problem - the "", causes the error.
I've had MAJOR problems with SSIS. Things that Access, Excel, and even DTS seemed to do very well, SSIS chokes on. Variable record-length data is another problem, but yes, these embedded qualifiers are a major problem, especially if you do not have access to the import files because they're on someone else's server that you pay to gain access to, and they might even be 4 to 5 GB in size! You can't just do a "replace all" on that every import.
You may want to check into a tool at Microsoft Downloads called "UnDouble", and here is another workaround you might try.
Seems like with SSIS in SQL Server 2008, the bug is still there. I don't know why they haven't addressed this in the parser, but it's like we went back in time with SSIS in basic import functionality.
UPDATE 11-18-2010: This bug still exists in SSIS. Amazing.
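What makes it extra frustrating is that the file in the question is perfectly standard CSV (doubling the quote is the RFC 4180 escape). As a sanity check, Python's csv module, used here just as a neutral reference parser, handles the problem row fine:
import csv
import io

# the problem row from the question, verbatim
row = '"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"'
print(next(csv.reader(io.StringIO(row))))
# -> ['S01266330002', 'CABLE:224",E122/261,8 CO', '', 'B', 'MP11']
So the format itself is fine; it's the SSIS flat-file parser that can't cope.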
How about just:
Search/replace all "", with ''; (fix all the broken fields)
Search/replace all ;''; with ,"", (to "unfix" properly empty fields.)
Search/replace all '';''; with "","", (to "unfix" properly empty fields which follow a correct encapsulation of embedded delimiters.)
That converts your original to:
"1464885","LEVER WM","","B","MP17"
"1465075",":PLT-BC !!NOTE!!","","B",""
"1465076","BRKT-STR MTR !NOTE!","","B",""
"1465172",":BRKT-SW MTG !NOTE!","","B","MP16"
"1465388","BUSS BAR !NOTE!","","B","MP10"
"1465391","PLT-BLKHD ""NOTE""","","B","MP20"
"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224'';E122/261,8 CO","","B","MP11"
Which seems to run the gauntlet fine in SSIS. You may have to apply step 3 recursively to account for three empty fields in a row ('';'';'';, etc.), but the bottom line here is that when you have embedded text qualifiers, you have to either escape them or replace them. Let this be a lesson for your CSV creation processes going forward.
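If you'd rather not run those replaces by hand in an editor, here's a literal transcription of the three steps as a small Python sketch (file names are placeholders; verify the output against a copy of the data before importing):
# step-by-step transcription of the search/replace workaround above
text = open('input.csv', 'r').read()
text = text.replace('"",', "'';")          # step 1: fix the broken fields
text = text.replace(";'';", ',"",')        # step 2: un-fix properly empty fields
prev = None
while prev != text:                        # step 3, repeated for runs of empty fields
    prev = text
    text = text.replace("'';'';", '"","",')
open('fixed.csv', 'w').write(text)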
Microsoft says doubled double quotes inside double quote delimited fields just don't work. A fix is planned for the end of 2011...
In the meantime we will have to use workarounds like those described in the other answers.
I would just do a search/replace for ", and replace it with ,
Do you have access to the original file?
