I have a REGEX SQL CLR function:
var rule1 = new Regex("شماره\\s?\\d{1,10}");
Calling it on SQL Server 2016, however, returns this error:
System.ArgumentException: parsing "?????\s?\d{1,10}" - Quantifier {x,y} following nothing.
at System.Text.RegularExpressions.Regex..ctor(String pattern)
It seems that my Unicode characters are being changed to question marks, which makes the whole Regex wrong.
This issue has nothing to do with datatypes, whether for input parameters or return values. The code provided, while sparse on detail, shows enough to see that:
there is no input parameter being used (the string is hard-coded), and
the error is being thrown by System.Text.RegularExpressions.Regex, so it has nothing to do with T-SQL or with return values / types.
Also, while the error message does mention "Quantifier {x,y}", and there is indeed a {1,10} quantifier being used in the Regular Expression, it is a false correlation (albeit a rather understandable one) that the error message is referring to that specific quantifier. If you shorten the Regular Expression down to just "شماره", you will get the same error, except it will report the Regular Expression as being just "?????". Hence, "Quantifier {x,y}" actually refers to the first "?" in the expression shown in the error message (you will get the same error even if the Regular Expression is nothing more than "ش"). I figure that "Quantifier {x,y}" is the generalized way of looking at the ?, +, and * quantifiers as they can also be expressed as {0,1}, {1,}, and {0,}, respectively (or at least they should be).
This issue has nothing to do with SQL Server, or even Regular Expressions. This is an encoding issue, and RegEx is reporting the problem because it is being given ????? instead of شماره.
TL;DR: Check your source code file's encoding. You might need to go to "Save As...", click on the down-arrow to the right of the word "Save" on the "Save" button, select "Save with Encoding...", and then select "Unicode (UTF-8 with signature) - Codepage 65001".
There is a problem with the project configuration and/or the compiler. I placed the following string in both a Console Application and a Database Project:
"-😈-ŏ-א---\U0001F608-\u014F-\u05D0-"
(The second half of that test string, after the ---, is merely the escape sequences for the same three characters as appear in the first half, and in the same order.)
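(One way to compute the expected UTF-16 LE byte sequence to search for in the compiled EXE / DLL with a hex viewer - shown here only for illustration; the class name is arbitrary:)

using System;
using System.Text;

static class ExpectedHex
{
    static void Main()
    {
        // The all-escape-sequence form of the test string above.
        const string test = "-\U0001F608-\u014F-\u05D0---\U0001F608-\u014F-\u05D0-";

        foreach (byte b in Encoding.Unicode.GetBytes(test))   // Encoding.Unicode = UTF-16 LE
            Console.Write("{0:X2}", b);
        Console.WriteLine();
    }
}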
I compiled both and inspected the compiled output (meaning: it hasn't been deployed to SQL Server yet). That string appears in the EXE file (Console App) as:
2D003DD808DE2D004F012D00D0052D002D002D003DD808DE2D004F012D00D0052D00
which is the UTF-16 LE encoding for: -😈-ŏ-א---😈-ŏ-א-
Yet, it appears in the DLL file (SQLCLR Assembly) as:
2D003F003F002D003F002D003F002D002D002D003DD808DE2D004F012D00D0052D00
which is the UTF-16 LE encoding for: -??-?-?---😈-ŏ-א-
I even changed the output type of the Console App project to be "Class Library" and the string still got embedded correctly in that DLL file. So, for some reason, the literal characters are being turned into literal question marks when compiled into a SQLCLR assembly. I haven't yet figured out what is causing this, as a quick look at the config settings and command-line flags for csc.exe shows them to be effectively the same.
In either case, it should be clear that specifying the Arabic characters via escape sequences, while cumbersome, will at least work, hence providing a (hopefully short-term) work-around so that you can move forward on this. I will continue looking to see what could be causing this difference in behavior.
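For example (just a sketch with an illustrative class name - the escape sequences spell out "شماره" as U+0634, U+0645, U+0627, U+0631, U+0647, so the pattern no longer depends on the source file's encoding):

using System.Text.RegularExpressions;

public static class PatternHolder
{
    // "\u0634\u0645\u0627\u0631\u0647" is "شماره" written entirely as escape
    // sequences, so it survives even if the file is saved in a non-Unicode code page.
    private static readonly Regex rule1 =
        new Regex("\u0634\u0645\u0627\u0631\u0647\\s?\\d{1,10}");
}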
UPDATE
In order to determine if the string was being converted to an 8-bit encoding or something else, I added two characters to the test string (one in both Windows-1252 and ISO-8859-1, and one only in Windows-1252):
§ = 0xA7 in CP-1252, 0xA7 in ISO-8859-1, and 0x00A7 in UTF-16
œ = 0x9C in CP-1252, not in ISO-8859-1, and 0x0153 in UTF-16
The new test string is:
"-😈-ŏ-א-§-œ---\U0001F608-\u014F-\u05D0-\x00A7-\x0153-"
That string appears in the EXE file (Console App) as:
2D003DD808DE2D004F012D00D0052D00A7002D0053012D002D002D003DD808DE2D004F012D00D0052D00A7002D0053012D00
which is the UTF-16 LE encoding for: -😈-ŏ-א-§-œ---😈-ŏ-א-§-œ-
Yet, it appears in the DLL file (SQLCLR Assembly) as:
2D003F003F002D003F002D003F002D00A7002D0053012D002D002D003DD808DE2D004F012D00D0052D00A7002D0053012D00
which is the UTF-16 LE encoding for: -??-?-?-§-œ---😈-ŏ-א-§-œ-
So, because both § and œ came through correctly in the SQLCLR Assembly, it is clearly not ISO-8859-1. It is either code page Windows-1252 or some other code page that supports both of those characters (CP-1252 being the most likely, given that my system is using it).
Still investigating the root cause...
UPDATE 2
Ok, I feel kinda silly. Sometimes it helps to close a file (or even the entire solution) and reopen it. Doing so, I noticed that my test string now appeared as:
"-??-?-?-?-?---\U0001F608-\u014F-\u05D0-\x00A7-\x0153-"
Funny, I don't remember pasting that in ;-). So, I checked the file encoding that Visual Studio was saving it as, and sure enough it was "Western European (Windows) - Codepage 1252". And just to be extra certain, I checked the file for the Console App, and it was correctly set to "Unicode (UTF-8 with signature) - Codepage 65001". D'oh! After changing the file encoding under "Save As..." to "Unicode (UTF-8 with signature) - Codepage 65001", I replaced both the test string and the O.P.'s Regular Expression. Both came through perfectly: no errors, no question marks.
I have a Zortrax M200 3D printer, which you may or may not be familiar with. It is closed source and uses its own proprietary software to produce Z-code files, which should in principle be almost identical to G-code.
My curiosity has kicked in, and I'm wondering whether there is a way to decrypt a Z-code file or convert a G-code file to Z-code. How would one go about investigating this?
Here is a z-code file:
https://drive.google.com/file/d/0ByYqoSxe29qtS05UZlpDclBZNWs/view?usp=sharing
Yes, it is possible. You can find a good tool (with source code) to convert Z-code to G-code here: https://github.com/bonafid3/zcode2gcode
In the file text.txt I have this sentence:
"Příliš žluťoučký kůň úpěl ďábelské ódy."
(I think Windows uses the Windows-1250 code page to represent this text.)
In my program I save it to a buffer:
char string[1000];
and render the string with SDL_ttf to an SDL_Surface *surface:
surface = TTF_RenderText_Blended(font, string, color);
/*(font is true type and support this text)*/
But it does not give the correct result:
(I can't post an image yet, so I can only describe that ř, í, š, ž, ť, ů, ň, ď are not displayed correctly.)
Is it possible to use SDL_ttf to render this sentence correctly?
(I also tried TTF_RenderUTF8_Blended, TTF_RenderUNICODE_Solid, ... with worse results.)
The docs for TTF_RenderText_Blended say that it takes a Latin-1 (ISO 8859-1) string - this is why it isn't working.
You'll need to convert your input text to UTF-8 and use RenderUTF8, or to UTF-16 and use RenderUNICODE to ensure it is interpreted correctly.
How you do this depends on what platform your app is targeting - if it is Windows, then the easiest way would be to use the MultiByteToWideChar Win32 API to convert the text to UTF-16 and then use TTF_RenderUNICODE_Blended to draw it.
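For example (an untested sketch: it assumes the buffer really contains Windows-1250 text and that you are building for Windows; render_cp1250 and the fixed buffer size are just for illustration):

#include <windows.h>
#include <SDL_ttf.h>

SDL_Surface *render_cp1250(TTF_Font *font, const char *cp1250_text, SDL_Color color)
{
    Uint16 utf16[1000];

    /* 1250 = Windows-1250 (Central European); MultiByteToWideChar produces
       UTF-16, which is what TTF_RenderUNICODE_* expects. */
    int len = MultiByteToWideChar(1250, 0, cp1250_text, -1,
                                  (WCHAR *)utf16, 1000);
    if (len == 0)
        return NULL;   /* conversion failed or the buffer was too small */

    return TTF_RenderUNICODE_Blended(font, utf16, color);
}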
My solution will be this: three input files. The first file will contain a set of symbols from the Czech alphabet. The second file will be a sprite bitmap in which the glyphs are arranged in the same order as the symbols in the first file. In my program, each symbol from the third input file (the text) will be compared with the symbols in the first file, and the matching section of the sprite will be copied to the screen, one by one.
I will leave out SDL_ttf. This approach has some advantages and disadvantages, but I think it will work for my purposes.
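Something along these lines (just a rough sketch of the idea - the glyph order, cell size and function name are made up, and it assumes a fixed-width sprite sheet and a single-byte encoding such as Windows-1250, so one char is one symbol):

#include <string.h>
#include <SDL.h>

#define GLYPH_W 16
#define GLYPH_H 24

/* File 1: the symbols, in the same order as the glyphs in the sprite sheet. */
static const char *glyph_order = "abcdefghijklmnopqrstuvwxyz";

void draw_text(SDL_Surface *screen, SDL_Surface *sprite,
               const char *text, int x, int y)
{
    for (; *text; ++text, x += GLYPH_W) {
        const char *p = strchr(glyph_order, *text);
        if (!p)
            continue;   /* unknown symbol: skip it */

        /* Copy the matching cell of the sprite sheet onto the screen. */
        SDL_Rect src = { (int)(p - glyph_order) * GLYPH_W, 0, GLYPH_W, GLYPH_H };
        SDL_Rect dst = { x, y, 0, 0 };
        SDL_BlitSurface(sprite, &src, screen, &dst);
    }
}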
Thanks for all the responses.
Is there any way in C to read an image stream by stream, and how can I find out how many streams there are in an image?
The image is a JPEG.
And will I have any problem saving such a stream in another file?
You can have a look at the free OpenCV library: http://opencv.org/
There is a tutorial with some examples here: http://www.cs.iit.edu/~agam/cs512/lect-notes/opencv-intro/
It is widely used for this kind of processing.
Without using any external library, what you have as a JPEG image is simply a binary file. You may do whatever you like with it via fopen(), fread(), or any other file functions.
Will I have any problem saving this stream in another file?
No problem if you are just copying a JPEG image to another new file. But there may be a problem if you change the image's extension.
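For example, copying the file byte-for-byte with plain stdio is enough (a minimal sketch; the function name, paths and buffer size are placeholders):

#include <stdio.h>

int copy_jpeg(const char *src_path, const char *dst_path)
{
    FILE *src = fopen(src_path, "rb");   /* "b": treat the data as raw bytes */
    FILE *dst = fopen(dst_path, "wb");
    unsigned char buf[4096];
    size_t n;

    if (src == NULL || dst == NULL) {
        if (src) fclose(src);
        if (dst) fclose(dst);
        return -1;
    }

    while ((n = fread(buf, 1, sizeof buf, src)) > 0)
        fwrite(buf, 1, n, dst);

    fclose(src);
    fclose(dst);
    return 0;
}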
I am looking for some input on how to programmatically convert mp4 files to fragmented f4f files with accompanying manifests.
I currently have an implementation for creating segmented MPEG2-TS files with an accompanying manifest for Apple's HLS, and I want to create a similar piece of software for Adobe's HDS.
My code is based on Libav (alternatively, ffmpeg), so I was hoping they had native support for muxing f4f files, but I have not been able to find any resources for it.
What I am specifically looking for:
Whether (and how) the format is supported in libav.
Whether there are any special requirements (such as the h264_mp4toannexb bitstream filter required when converting MP4 to MPEG2-TS).
Any sample code (even if it doesn't use libav/ffmpeg).
An easy-to-read manifest specification.
I'm afraid you have to read the MP4/F4F specifications and implement it yourself.
MP4 file format: ISO/IEC 14496-14
F4F file format: it is included in the F4V specification (http://www.adobe.com/cn/devnet/f4v.html).
The code of mod_h264_streaming (http://h264.code-shop.com/trac) may be helpful.