Symfony translation messages with arrays, find length

Setup:
Consider a multilingual site built with Symfony 3, using YAML files for translation messages and Twig templates for display.
In one section of content, in English, there should be an instruction list showing 3 items, each representing a step of a process. In German there should only be 2, while in Spanish there are 50.
The setup for this may be, for example:
#messages.en.yaml
root_node:
    -
        title: English title 1
        item: Some first list item
    -
        title: English title 2
        item: Some second list item
    -
        title: English title 3
        item: Some third list item

#messages.de.yaml
root_node:
    -
        title: German title 1
        item: Some other first list item
    -
        title: German title 2
        item: Some second German list item

#messages.es.yaml
root_node:
    -
        title: Spanish title 1
        item: Some Spanish first list item
    ...
    etc ...
Considering this structure, ideally I would like the template to detect that root_node is an array and get its length, so that the total number of steps can be displayed, and then to loop over the array in the Twig template and output each step as a translated title and item text.
Attempting to read the array length with {{'root_node'|length}} or {{('root_node'|trans)|length}} just gives the length of the text, though, and {{root_node|length}} gives an error because the variable is not in the scope of the template; it has to be looked up through the translation service.
Question:
How can I read the length of the array at this translation key?
Should this even be attempted in this manner? If not, is there a best practice for translating arrays of unknown size (dependent on translation language) in Symfony?

There is no direct way to achieve this behavior with the Symfony translator. However, it can be achieved indirectly by using Symfony's pluralization mechanism.
You can set custom pluralization logic via the PluralizationRules::set method and then call Translator::transChoice, passing the step number as the choice argument. In your template you would use the transchoice(index) filter to obtain the required step.
You can obtain the list of steps for a particular locale by counting them separately from the translation catalogue and passing the count to the template. Alternatively, depending on your application's logic, you can simply fetch the translated message for the current step and test whether it is an empty string.
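For the second approach (counting the steps from the catalogue), a minimal controller-side sketch could look like the following, assuming a standard Symfony 3 controller. The controller, action and template names are hypothetical, and the key pattern relies on Symfony flattening nested YAML translations into dotted ids such as root_node.0.title:

use Symfony\Bundle\FrameworkBundle\Controller\Controller;

class StepsController extends Controller
{
    public function processAction()
    {
        // The default translator service also implements TranslatorBagInterface,
        // so the catalogue loaded for the current locale can be inspected directly.
        $messages = $this->get('translator')->getCatalogue()->all('messages');

        // Count the ids of the form root_node.<n>.title to get the number of
        // steps defined for this locale (3 for en, 2 for de, 50 for es).
        $stepCount = count(preg_grep('/^root_node\.\d+\.title$/', array_keys($messages)));

        return $this->render('steps.html.twig', ['stepCount' => $stepCount]);
    }
}

In the template you can then loop with {% for i in 0..stepCount - 1 %} and output {{ ('root_node.' ~ i ~ '.title')|trans }} together with the matching item key, so the number of steps follows whatever the active locale's catalogue contains.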

Related

Using AWS Textract to classify a given document's page structure into headlines and paragraphs

I have been searching all over the internet for a way to extract a meaningful page structure from an uploaded document (headlines/titles and paragraphs). The document could be of any format but I'm currently testing with PDF.
Example of what I'm trying to do:
Upload PDF file client-side
Save it to S3
Request AWS Textract to detect or analyze text in that S3 object
Classify the output into: Headlines and Paragraphs
My application works fine up to step 3. AWS Textract outputs the result as blocks; block types can be PAGE, LINE or WORD, and each block has a Geometry object which includes bounding box details and a Polygon object as well (more info here: AnalyzeCommandOutput (JS SDK) and AnalyzeCommandOutput (General)).
However, I still need to process the output and classify it into headlines and paragraphs (e.g. one block of type LINE could be a headline and the following 3 blocks of type LINE a single paragraph), so the output of step 4 would be:
{
  "Headlines": ["Headline1", "Headline2", "Headline3"],
  "Paragraphs": [
    {"Paragraph": "Paragraph1", "Headline": "Headline1"},
    {"Paragraph": "Paragraph2", "Headline": "Headline1"}
  ]
}
The unsuccessful methods I tried:
Calculating the size of a line's bounding box relative to the page size and comparing it to the average bounding box size: if it is greater, it is a headline; if it is smaller than or equal, it is a paragraph (not practical)
Using other PDF parsers, but most of them just output unformatted text
Using the "Query" option of the analyze-document input, but it would require defining each line in the PDF as a key-value pair to output something meaningful, as per here. So the PDF content would be something like:
Headline1: Headline
Paragraph1: Paragraph
Paragraph2: Paragraph
Headline2: Headline
Paragraph1: Paragraph
I'm not asking for a coding solution. Maybe I'm overcomplicating things and there is a simpler way to do it. Maybe someone has tried something similar and can point me in the right direction or suggest an approach.

How to verify words one by one on a locator in Robot Framework?

I am still a bit new to Robot Framework, but please rest assured I am constantly reading its User Guide. I am a bit stuck now with one test case.
I have a list of individual words that I need to verify on a page, mostly German translations of field labels, to check whether they appear correctly or are found in a given element at all.
I have created a list variable as follows:
@{GERMAN_WORDS} | Benutzer | Passwort | Sendung | Transaktionen | Notiz
I have the following locator that contains the text labels on the webpage, and the one I need to verify:
${GENERAL_GERMAN_BOARD} | xpath=//*[@id="generalAndIncidents:generalAndIncidentsPanel"]
I would like to check every single word from the list variable, one by one, to see whether it is present in the locator above.
I created the following keyword for this purpose; however, I might be missing something, because it uses the entire content of my list variable instead of checking its words one by one:
Block Text Verification
    [Arguments]    ${text_list_variable}    ${locator_to_check}
    Wait Until Element is Visible    ${locator_to_check}
    FOR    ${label}    IN    ${text_list_variable}
        ${labelTostring}    Convert to String    ${label}
        ${isMatching} =    Run Keyword and Return Status    Element Should Contain    ${locator_to_check}    ${labelTostring}
        Log    ${label}
        Log    ${isMatching}
        Exit For Loop If    '${isMatching}' == 'False'
    END
I am getting the following output for this:
Element 'xpath=//*[@id="generalAndIncidents:generalAndIncidentsPanel"]' should have contained text '['Benutzer', 'Passwort', 'Sendung', 'Transaktionen', 'Notiz']' but its text was ... (and it lists all the text from my locator)
So, it is basically not checking the words one by one.
Am I doing something wrong here? Is this a bad approach?
I would be grateful if anyone could give me a hint on what I should do here instead!
Thank you very much!
You've made one small but crucial mistake: the variable in this line
FOR    ${label}    IN    ${text_list_variable}
should be accessed with @:
FOR    ${label}    IN    @{text_list_variable}
The FOR ... IN loop in RF expects one or more arguments as the values to loop over, and @ expands a list variable into its individual members.
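With only that one character changed, the keyword from the question would look like this (a sketch; the rest of the keyword is unchanged):
Block Text Verification
    [Arguments]    ${text_list_variable}    ${locator_to_check}
    Wait Until Element is Visible    ${locator_to_check}
    FOR    ${label}    IN    @{text_list_variable}
        ${labelTostring}    Convert to String    ${label}
        ${isMatching} =    Run Keyword and Return Status    Element Should Contain    ${locator_to_check}    ${labelTostring}
        Log    ${label}
        Log    ${isMatching}
        Exit For Loop If    '${isMatching}' == 'False'
    END
Called, for example, as Block Text Verification    ${GERMAN_WORDS}    ${GENERAL_GERMAN_BOARD}, it checks each German word against the panel individually and stops at the first one that is missing.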

Can the answer unit content array returned by the Watson Document Conversion service ever have more than one element?

I am writing a program that takes advantage of IBM Watson's Document Conversion service to convert documents of various types into answer units. Each answer unit that is returned by the service contains an array named content which is composed of objects having a media_type and a text element.
I've never seen more than one element in this content array, and I'm not sure how I would handle it if there were. Can there ever be more than one element in this array and, if so, what are the possible values? Will they all have the same media_type value? My plan at the moment is to combine all of the text elements into one if more than one exists.
The answer unit content array can have more than one element (if you request that - see below). If it does, each element in the array will be a different media type representation of the same contents.
You can get this by putting more than one output media type in your request. When you do this, the output content array will contain more than one element, with one element for each of the media types you request.
For example, if your request contained a config like this:
{
  conversion_target: 'answer_units',
  answer_units: {
    output_media_types: ['text/plain', 'text/html']
  }
}
(see https://www.ibm.com/watson/developercloud/document-conversion/api/v1/#convert-document for an explanation of where to put the config)
Then the content in your response will contain:
content: [
  {
    text: <the plain text contents of the answer unit>,
    ...
  },
  {
    text: <the HTML contents of the answer unit>,
    ...
  }
]
If you don't specify the output media type parameter, you'll get the default value which is:
output_media_types : ['text/plain']
This is why you're always getting an array of length 1 with a plain-text version of the output: implicitly, by leaving the default config, you're asking for a single output media type.
The Answer Units converter currently only splits by heading tags (<h1> and <h2> by default). If you want to split your answer units more granularly, you can change the level at which it splits by passing in a custom configuration:
{
  "answer_units": {
    "selector_tags": ["h1", "h2", "h3", "h4", "h5", "h6"]
  }
}
See https://www.ibm.com/watson/developercloud/doc/document-conversion/customizing.shtml#htmlau

Angular strict spell search

How can I search names starting with the first letter that the user types? If the user types B, then names starting with B must be displayed, rather than a word like "SBI" where B comes in second position. I want to match only words whose first letter matches.
For example, if
list = {SBI,
BSI,
isb,
bsisib,
be happy,
dont worry,
hello}
then if I type the character h, I want all the words starting with 'h', but when I tried it, it showed 'be happy' and 'hello' as results for h. I want 'hello' only. Thank you
If your list isn't extremely long, try applying a regular expression to each item in an ng-repeat filter.
Not sure if you want to match the first word or the first character of each word.
Try the following plunker.
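The plunker's contents aren't reproduced here, but a minimal sketch of that kind of filter, assuming a plain array of strings and hypothetical module/filter names, could look like this:

// Hypothetical module name; adjust to your app.
var app = angular.module('demoApp', []);

app.filter('startsWith', function () {
  return function (items, query) {
    if (!query) {
      return items;            // nothing typed yet: show the full list
    }
    var prefix = String(query).toLowerCase();
    // Keep only items whose whole text begins with the typed prefix,
    // so "h" matches "hello" but not "be happy".
    return (items || []).filter(function (item) {
      return String(item).toLowerCase().indexOf(prefix) === 0;
    });
  };
});

It could then be used in the template as ng-repeat="name in list | startsWith:searchText", where searchText is bound to whatever the user types.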
If you are using this on an input field, then I suggest you use an existing autocomplete solution for AngularJS. There are many of them out there and they are quite easy to use.
http://ngmodules.org/modules/ngAutocomplete
Plunkr: http://plnkr.co/edit/il2J8qOI2Dr7Ik1KHRm8?p=preview
http://angular-ui.github.io/bootstrap/
Check the typeahead from Angular UI Bootstrap.
Or just google "angularjs autocomplete"; you will find tons of results.

Get a Done list with doxygen

It is well known how to obtain a TODO list in Doxygen, typing:
\todo Item one
\todo Item two
and so on. But when something has been done, how do you keep track of it?
If I have done item two, I don't want to remove it; I want to mark it as done:
\todo Item one
\done Item two
How do I do this?
I dug around in the Doxygen documentation and stumbled over \xrefitem. It's described as:
A generalization of commands such as \todo and \bug. It can be used to
create user-defined text sections which are automatically
cross-referenced between the place of occurrence and a related page,
which will be generated. On the related page all sections of the same
type will be collected.
The first argument is an identifier uniquely representing the
type of the section. The second argument is a quoted string
representing the heading of the section under which text passed as the
fourth argument is put. The third argument (list title) is used as the
title for the related page containing all items with the same key. The
keys "todo", "test", "bug" and "deprecated" are predefined.
So you could specify a new alias, e.g. "done" in your Doxyfile:
ALIASES += "done=\xrefitem done \"Implemented TODOs\" \"Implemented
TODOs\" "
And in your code you should be able to use the new "done" tag like all the others:
/// \done fixed broken function
According to the doxygen manual there is no such "inverse" of the \todo command. Perhaps you can just keep the \todo and mark it manually as done, somehow.
Unfortunately, doxygen's Markdown doesn't seem to support strikethrough (unlike Stack Overflow's, obviously), which would otherwise have been a good and common choice. Perhaps you can set it up using custom styling and spans.
