Is there such a thing as a preprocessor whose statements, once processed, disappear completely and get replaced by the target language syntax permanently?
I want to research it on the web but I don't know what term to search for. If I search for "code generator", "templating language", "preprocessor directives", "mixins", "annotations" I get generators whose input becomes the source of truth.
The closest thing I can think of is a macro.
What I'm trying to do
I often have to write code that is verbose and unnecessary manual labor and am looking for a smarter way to input at least the majority of it and have it automatically transformed and only source-control the output (and hand edit if necessary). For example:
Java code - Instead of writing getters/setters, javadoc (perhaps the transformer can be a maven plugin)
HTML - I just want to add URLs, and have my preprocessor automatically convert them to links, images, videos, audio etc. depending on the file extension with some regex substitution (currently I run a perl script via a cron job)
I just want to use it as my own shorthand, not enforce it in my project, and keep the output editable so that others don't have to learn a new framework or language (like Protobuf, StringTemplate, GWT, C hash-defines, PHP, JSP etc).
There should be no direct clue that I used a template/preprocessor to generate it.
What you want is a "program transformation system". See https://en.wikipedia.org/wiki/Program_transformation. (This is a superset of "transpilers" [ugly term]).
A good source-to-source transformation system will let you apply rewrite rules of the form of:
if you see *this*, replace it by *that* if *this_condition*.
You can then take your source code, and run a set of rewrite rules across that code to change it.
The resulting code is "transformed"; the rewrite rules are not visible.
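As a rough illustration of the "if you see this, replace it by that if this_condition" form, here is a minimal Python sketch. It only does regex rewriting on plain text, whereas a real program transformation system works on parsed syntax trees, so treat it as a toy. The names Rule, apply_rules, URL and RULES are made up for this example, and the URL-to-tag rules are just borrowed from the HTML use case in the question.

```python
import re
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Rule:
    pattern: re.Pattern                    # "if you see this"
    replace: Callable[[re.Match], str]     # "replace it by that"
    condition: Callable[[re.Match], bool]  # "if this_condition"


def apply_rules(text: str, rules: Sequence[Rule]) -> str:
    """Run each rule over the text in order; a match is rewritten only if its condition holds."""
    for rule in rules:
        def substitute(match, rule=rule):
            return rule.replace(match) if rule.condition(match) else match.group(0)
        text = rule.pattern.sub(substitute, text)
    return text


# A bare URL: not directly preceded by a quote or '>', so URLs that already
# sit inside generated tags are left alone (this keeps the rules re-runnable).
URL = re.compile(r'(?<![">])https?://[^\s<>"]+')
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")

RULES = [
    # Image URLs become <img> tags.
    Rule(URL,
         lambda m: '<img src="{}">'.format(m.group(0)),
         lambda m: m.group(0).lower().endswith(IMAGE_EXTENSIONS)),
    # Any bare URL still left becomes a plain link.
    Rule(URL,
         lambda m: '<a href="{0}">{0}</a>'.format(m.group(0)),
         lambda m: True),
]
```

Calling apply_rules(text, RULES) returns the rewritten text, and nothing in the output refers back to the rules, which is the property the question asks for.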
It seems like Transpiler is one way to describe it.
Related
I'd like to use the PlantUML syntax to define component structures, which I want to process in my own tool. However, I'd like to avoid having to write a PlantUML parser. Is there some sort of intermediate representation in PlantUML which I could use for that? It would be perfect to have e.g. a JSON structure which contains all diagram objects and the relations among them in a concise way.
I could not find anything in the docs, maybe someone with more insights in the project can help?
As Jean-Marc Volle pointed out, the project github.com/jupe/puml2code allows you to process puml files and generate source code in different languages using handlebar templates. Currently the code generation is limited to classes in a puml file.
I have used puml2code as a starting point for a new project, github.com/robbito/puml2json, which simplifies the process a bit, as it doesn't require handlebar. JSON is generated directly from the PlantUML code. puml2json currently also only supports a subset of PlantUML.
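To give an idea of what "a subset of PlantUML turned into JSON" can look like, here is a small Python sketch. It is not how puml2json works and the output shape is not puml2json's format; both are assumptions made up for illustration. It only understands lines of the form `component Name` and `[A] --> [B] : label` and silently skips everything else.

```python
import json
import re

# Only these two line shapes are recognised (an assumption of this sketch).
COMPONENT = re.compile(r"^component\s+(\w+)$")
RELATION = re.compile(r"^\[(\w+)\]\s*-+>\s*\[(\w+)\](?:\s*:\s*(.+))?$")


def puml_to_dict(source):
    """Collect component names and directed relations from a tiny PlantUML subset."""
    components, relations = [], []
    for raw_line in source.splitlines():
        line = raw_line.strip()
        component = COMPONENT.match(line)
        relation = RELATION.match(line)
        if component:
            components.append(component.group(1))
        elif relation:
            relations.append({
                "from": relation.group(1),
                "to": relation.group(2),
                "label": relation.group(3),  # None when no label is given
            })
    return {"components": components, "relations": relations}


def puml_to_json(source):
    """Serialise the extracted structure as JSON text."""
    return json.dumps(puml_to_dict(source), indent=2)
```

Anything beyond this (nested packages, interfaces, stereotypes, aliases) needs a real parser, which is exactly the work the projects above try to take off your hands.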
I'm currently working on a project which has to be MISRA 2012 compliant. But in the embedded world, you can't fulfill every MISRA rule. So I have to suppress some messages generated by QA-C. What's the best solution to do this?
I was thinking about making a table in every module header file with references (\ref and \anchor) to the relevant code lines, a description, etc. The first problem is: I can't use the Doxygen markdown table feature, because then the description has to be in one line, because Doxygen tables don't support line breaking. So I thought about using a simple verbatim table, what do you think?
Or is there a way to generate such a table automatically?
Greetings
m0nKeY
According to MISRA, all such undesired rules must be handled by your deviation procedure, given that they are either "required" or "advisory". You are not allowed to deviate from "mandatory" rules. (Strictly speaking, you don't need to invoke the deviation procedure for advisory rules.)
In my experience, the safest and smoothest way by far to do this, is to not allow individual deviations on case-by-case basis. All deviations from MISRA should be stated in your company coding standard, and in order to deviate you have to update that document. Which in turn enforces approval from the document owner, who is preferably the most hardened C veteran you have in the team.
That way, you prevent less experienced team members from misinterpreting the rules and ignoring important rules, simply because they don't understand them and mistake them for false positives. There should be a rationale in the document stating why the rule you deviate from is not feasible for your company.
This means that everyone in the dev team is allowed to deviate from the listed rules at any point, without the need to invoke any form of bureaucracy.
Once you have a setup like this, simply customize your static analyser and remove/ignore the undesired warnings. That way, you get rid of a lot of noise and false warnings from the tool.
To answer your question generally: To create an aggregate occurrence list of anything in doxygen, use \xrefitem
We use this as a tool in our code review process. I tag code with a custom tag \reviewme which adds the function to a list of all code in need of peer review. The next guy can come along and clear that tag. We have another custom tag \reviewedby which does not use \xrefitem but simply puts the reviewer's name and the date in the code block saying who reviewed it and when. This has gotten a bit clunky as things have scaled with larger code bases and more developers. Now we're looking into tools that integrate with our version control process to handle this better. But when we started this it worked well and fit a shoestring budget. That example should give you an idea of what it is capable of.
Here is a screen shot of what the output looks like - proprietary stuff and auto names redacted:
Here is how we added this custom tag as an alias to \xrefitem in our doxy file:
ALIASES = "reviewme = \xrefitem reviewme \"This section needs peer review\" \"Documentation block or code sections that need peer review\""
To add it from the GUI, you would go to Expert->Project->Aliases and add a line like this
reviewme = \xrefitem reviewme "This section needs peer review" "Documentation block or code sections that need peer review"
Same thing, just no need to put quotes around the whole thing and escape out the inner quotes.
\xrefitem is the underpinning of how things like \todo or \bug work in doxygen. You can make a list of just about anything your heart desires.
Speaking specifically to MISRA exceptions: Lundin's post has lots of merit. I would consider it. I think a better place to document exceptions to coding standards is in the static analysis tool itself. Many tools have their own annotations where you can categorize the rule violation as 'excused' or whatever. But generally this does not remove them from the list, it just allows you to filter or sort them. Perhaps you can use a regex in a script that runs prior to doxygen and replaces the tool-specific annotation with a custom \xrefitem if you are really concerned. Or vice versa, replace the doxy annotation with your tool's annotation.
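A minimal sketch of that pre-doxygen regex step, in Python. The annotation format `/* DEVIATION <rule>: <reason> */` is invented for this example and is not QA-C syntax, so the pattern would have to be adapted to whatever your tool really uses. `\misradev` is likewise a hypothetical alias that you would define on top of \xrefitem the same way as \reviewme above.

```python
import re

# Hypothetical tool annotation, e.g.  /* DEVIATION Rule-11.4: register access */
ANNOTATION = re.compile(
    r"/\*\s*DEVIATION\s+(?P<rule>[\w.\-]+)\s*:\s*(?P<reason>.*?)\s*\*/"
)


def annotations_to_doxygen(source):
    """Rewrite each tool annotation into a doxygen comment using the \\misradev alias."""
    def rewrite(match):
        # A function is used instead of a replacement string so the
        # backslash in \misradev is not treated as a regex escape.
        return "/** \\misradev {}: {} */".format(match["rule"], match["reason"])
    return ANNOTATION.sub(rewrite, source)
```

Doxygen's INPUT_FILTER setting is one place to hook a script like this in, so the rewritten text is only seen by doxygen and the source files stay untouched.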
First off, I've searched my hind end off for hours now trying to find an answer, but I can't seem to find anything remotely useful. What I am trying to do is find a way to add code folding to the built-in batch language. Basically I love using batch, but when I have tons of code, I want to be able to hide the code I do not need to edit, which will make it easier to find the code I DO need to edit. What I want is to be able to type "::{" (without quotes), have finished code in the middle, end with "::}" (also without quotes), and fold everything in between.
First question, is it possible? Can I add something like this (that one could normally add in the "user defined language") to the built-in batch language?
Next question, if not, where could I figure out how to basically re-create the batch language (and add my own twists) into a new "user defined language"?
Last question, if neither of those are possible, what are my other options?
Like I said, I've researched for hours. I'm not one to ask for help on forums, but I'm desperate at this point. All I want is to use the batch language and have code folding. Doesn't seem like too much to ask, but it might be!
Thanks!
In Notepad++ you can define a language by going to the Language menu --> Define your language (at least in version 6.6.9 anyway). On the Folder & Default tab, under Folding in code 1 style, input a ( into the "Open" box. Input a ) into the "Close" box. Save this as "Windows Batch" (or at least something that doesn't conflict with the in-built language named "batch").
Until you define styles, it'll be ugly and unusable, but it should allow you to collapse / expand parenthetical code blocks as a proof of concept and see whether this project is worthy of further effort. Your next steps will be to copypaste batch keywords from %PROGRAMFILES(x86)%\Notepad++\langs.model.xml, and use the "batch" language styles from your favorite theme in Notepad++\themes\. If I were doing it, I'd input a few basic things using the GUI (like keywords, folding characters, etc.), then export to an XML file on the Desktop and copypaste the rest from a theme, search-&-replacing stuff as needed to massage the theme into your user-defined language. At the end, import your massaged XML into the Define your language dialog. It was going to be more effort than I felt like exerting, but your mileage may vary. If you decide to undertake this journey and you complete it, I hope you'll consider sharing your efforts.
This similar question has a few answers that suggest some workarounds you might find worthwhile -- in particular, hiding, rather than collapsing.
This is in C Language
I want to know how I can write a program to look up all the input fields of a website. Any website. And then fill them in. I can write the simple web browser in VBS, but how can I analyse the input fields? Even better would be if I could click the input field and it puts the name of it in a box. That would be ideal.
Anyone can help? thanks :)
Are you sure you want to do this in C?
I ask because it is not easy. First of all, you need to be able to run the HTTP GET request against the webpage you wish to view. For this, you probably need libcurl; you definitely don't want to be writing from scratch at any rate.
Next, you need to process the HTML you get, finding all input fields. You do NOT want to do this using regular expressions, if only for the sake of bobince's blood pressure. The bit you need to take away is that HTML is not a regular language: you need an XML parser. Enter libxml. I'm sure there are other XML libraries out there, and even libraries for parsing HTML.
Finally, having done that (got the fields etc) you need to be able to populate them and submit the correct request as per the ACTION and METHOD parameters of the FORM.
This is of course assuming you know what the fields should be formatted with. And it also assumes nothing else is going on. If you have a javascript validated web form (I sincerely hope they're validating on the request too, but they might provide feedback via JS) you won't benefit from that (unless you're going to integrate JS, in which case you might as well write a browser).
This is not a trivial task and it is the reason there are accessibility standards for HTML, because otherwise it becomes tricky to interpret the form without human interaction.
Of course, this all assumes said html is well formed, which isn't always the case...
I might suggest another approach. BeautifulSoup is a well known Python web scraping library that works very well. Python as a language allows easier string manipulation too, which will dramatically cut down your development time. I'd suggest giving the need to use C some serious thought given the size and complexity of the task you want to undertake vs your need to get a result quickly. If you have a lot of time, by all means go for C.
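To show how short the "find all input fields" step gets in Python, here is a minimal sketch. It uses the standard library's html.parser rather than BeautifulSoup so it runs without installing anything; FormFieldCollector and find_forms are names made up for this example. Fetching the page and submitting the filled-in form are left out, and fields outside a form element are ignored.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collects each form's ACTION, METHOD and the fields declared inside it."""

    FIELD_TAGS = ("input", "select", "textarea")

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in self.FIELD_TAGS and self._current is not None:
            # An <input> without a type attribute is a text field.
            default_type = "text" if tag == "input" else tag
            self._current["fields"].append({
                "name": attrs.get("name"),
                "type": attrs.get("type") or default_type,
            })

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def find_forms(html):
    """Return a list of forms, each with its action, method and fields."""
    parser = FormFieldCollector()
    parser.feed(html)
    parser.close()
    return parser.forms
```

Usage is just find_forms(page_source), where page_source is the HTML text you downloaded. The caveats above still apply: this sees only what is in the markup, so fields created by JavaScript will not show up.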
So we are sure that we will be taking our product internationally and will eventually need to internationalize it. How much internationalizing would you recommend we do as we go along?
I guess in other words, is there any internationalization that is easy now but can be much worse if we let the code base mature and that won't slow us down very much if we choose to start doing it now?
Tech used: C#, WPF, WinForms
Prepare it now, before you write all the strings in the codebase itself.
Everything after now will be too late. It's now or never!
It's true that it is a bit of extra effort to prepare well now, but not doing it will end up being a lot more expensive.
If you won't follow all the guidelines in the links below, at least heed points 1,2 and 7 of the summary which are very cheap to do now and which cause the most pain afterwards in my experience.
Check these guidelines and see for yourself why it's better to start now and get everything prepared.
Developing world ready applications
Best practices for developing world ready applications
Little extract:
Move all localizable resources to separate resource-only DLLs. Localizable resources include user interface elements such as strings, error messages, dialog boxes, menus, and embedded object resources. (Moving the resources to a DLL afterwards will be a pain)
Do not hardcode strings or user interface resources. (If you don't prepare, you know you will hardcode strings)
Do not put nonlocalizable resources into the resource-only DLLs. This causes confusion for translators.
Do not use composite strings that are built at run time from concatenated phrases. Composite strings are difficult to localize because they often assume an English grammatical order that does not apply to all languages. (After the interface design, changing phrases gets harder)
Avoid ambiguous constructs such as "Empty Folder" where the strings can be translated differently depending on the grammatical roles of the strings' components. For example, "empty" can be either a verb or an adjective, and this can lead to different translations in languages such as Italian or French. (Same issue)
Avoid using images and icons that contain text in your application. They are expensive to localize. (Use text rendered over the image)
Allow plenty of room for the length of strings to expand in the user interface. In some languages, phrases can require 50-75 percent more space. (Same issue, if you don't plan for it now, redesign is more expensive)
Use the System.Resources.ResourceManager class to retrieve resources based on culture.
Use Microsoft Visual Studio .NET to create Windows Forms dialog boxes, so they can be localized using the Windows Forms Resource Editor (Winres.exe). Do not code Windows Forms dialog boxes by hand.
IMHO, to claim something is going to happen "in a few years" literally translates to "we hope one day", which really means "never". I would still skim over various tutorials to make sure you don't make any horrendous mistakes. Doing correct internationalization support now will mean less work in the future, and once you get used to it, it won't have any real effect on today's productivity. But if you can measure the goal in years, maybe it's not worth doing at all right now.
I have worked on two projects that did internationalization: a C# ASP.NET (existed before I joined the project) app and a PHP app (homebrewed my own method using a free Internationalization control and my own management app).
You should store all the text (labels, button text, etc etc) as data inside a database. Reference these with keys (I prefer to use the first 4 words, made uppercase, spaces converted to underscores and non alpha-numerics stripped out) and when you have a duplicate, append a number to the end. The benefit of this key method is the programmer has a pretty strong understanding of the content of the text just by looking at the key.
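A minimal Python sketch of that key scheme, to make the steps concrete. make_key is a made-up name, and two details are my own assumptions here: the duplicate number is appended as "_2", "_3" and so on, and the set of already-used keys is passed in by the caller (in practice it would come from the database).

```python
import re


def make_key(text, existing_keys):
    """Build a lookup key from the first four words of a UI string.

    Words are upper-cased, joined with underscores, and stripped of
    non-alphanumeric characters. If the key is already taken, a number
    is appended. The new key is added to existing_keys.
    """
    words = [re.sub(r"[^A-Za-z0-9]", "", word) for word in text.split()[:4]]
    base = "_".join(word.upper() for word in words if word)
    key, counter = base, 1
    while key in existing_keys:
        counter += 1
        key = "{}_{}".format(base, counter)
    existing_keys.add(key)
    return key
```

So "Are you sure you want to delete?" becomes ARE_YOU_SURE_YOU, and a second string starting with the same four words gets ARE_YOU_SURE_YOU_2.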
Write a utility to extract the data and build .NET resource files that you add into your project for compile. Create a separate resource file for each language. In your code, use the key to point to the proper entry.
I would skim over the MS documents on the subject:
http://www.microsoft.com/globaldev/getwr/dotneti18n.mspx
Some basic things to avoid:
never ever ever use translation software, hire a pro or an intern taking that language at a local college
never try to create text by appending two existing entries; because grammar differs greatly between languages, this will never work. So if you have a string that says "Click" and want one that says "Click Now", do not try to create a setup that merges two entries, or during translation, copy the word for click and translate the word now. Treat every string as a totally new translation from scratch
I will add: store and manipulate string data as Unicode (NVARCHAR in MS SQL).
Some questions to think about…
How much can you afford to delay the shipment of the English version of your application to save a bit of the cost of internationalizing later?
Will you still be trading if you don’t get the cash flow from shipping the English version quickly?
How will you get the UI right, if you don’t get feedback quickly from some customers about it?
How often will you rewrite the UI before you have to internationalize it?
Do your English customers wish to be able to customize strings in the UI? E.g. not everyone calls a “shipping note” the same thing.
As a large part of the pain of internationalizing is making sure you don’t break the English version, is automated system testing of the UI a better investment?
The only thing I think I will always do is: “Do not use composite strings that are built at run time from concatenated phrases”, and if you do so, don’t spread the code that builds up a single string over lots of methods.
Having your UI automatically resize (and lay out) to cope with the length of labels etc. will save you lots of time over the years if you can do it cheaply. There are lots of 3rd party control sets for Windows Forms that let you label text boxes etc. without having to put the labels on as separate controls.
I am just starting to internationalize a WinForms application; we hope to mostly be able to use the “name” of each control as the lookup key, without having to move lots into resource files etc. It is not always as hard as you think at first….
You could use NGettext.Wpf (it can be installed from NuGet, and yes I am the author, but I made it out of the frustrations listed in the other answers).
It is hosted in this GitHub repository, and here is the getting started section at the time of writing:
NGettext.Wpf is intended to work with dependency injection. You need to call the following at the entry point of your application:
NGettext.Wpf.CompositionRoot.Compose("ExampleDomainName");
The "ExampleDomainName" string is the domain name. This means that when the current culture is set to "da-DK" translations will be loaded from "Locale\da-DK\LC_MESSAGES\ExampleDomainName.mo" relative to where your WPF app is running (You must include the .mo files in your application and make sure they are copied to the output directory).
Now you can do something like this in XAML:
<Button CommandParameter="en-US"
Command="{StaticResource ChangeCultureCommand}"
Content="{wpf:Gettext English}" />
Which demonstrates two features of this library. The most important is the Gettext markup extension which will make sure the Content is set to the translation of "English" with respect to the current culture, and update it when the current culture is changed. The other feature it demonstrates is the ChangeCultureCommand which changes the current culture to the given culture, in this case "en-US".
I also highly recommend reading Preparing Strings from the gettext utilities manual.
Internationalization will let your product be usable in other countries, it's easy and should be done from the start (this way English speaking people all over the world can use your software), those 3 rules will get you most of the way there:
Support international characters - use only Unicode data types in files and databases.
Support international date, time and number formats - use CultureInfo.InvariantCulture when storing data to file or computer readable storage, use CultureInfo.CurrentCulture when displaying data or parsing user input, never do your own parsing, never use any other culture objects.
Textual data entered by the user should be considered a black box; don't try to break it up into words or letters, especially when displaying it to the user - different languages have different rules and the OS knows how to display left-to-right text, you don't.
Localization is translating the software into different languages, this is difficult and expensive, a good start is to never hard code strings and never build sentences out of smaller strings.
If you use test data, use non-English (e.g.: Russian, Polish, Norwegian etc) strings.
Encoding rears its ugly little head at every corner. If not in your own libraries, then in external ones.
I personally favor Russian because, although I don't speak a word of Russian (despite my name's origin), it has foreign chars in it and it takes way more space than English and therefore tests your spacing too.
Don't know if that is something language specific, or just because our Russian translator likes verbose strings.