Interpolating custom data onto a PDF - angularjs

I am building an Angular test preparation app (with Laravel 5.1 API). One of the requirements is to allow the user to print a certificate of achievement.
The client wants the person's name and credentials interpolated into the document (e.g., highlighted below). Here is a snapshot of the PDF template they sent:
The way I'm handling PDF viewing is simply by storing the file on S3 and giving them a link to that file.
Interpolating information into a PDF doc doesn't seem trivial and I haven't found much information on programmatically allowing this, but there are tools like DocHub, that allow you do edit while viewing the PDF.
I'm interested in learning:
is doing this programmatically trivial?
are there 3rd party tools I'm unaware of?
would I even be able to send this information along to the S3 link to interpolate in the first place?

Using PDF as a format for editing is usually a bad choice. If you have a form with fixed fields, then it's easy. Create a PDF template with an interactive form. In this form, based on AcroForm technology, you'll define fields with fixed coordinates, and a fixed size. You can then add content to these fields.
One major disadvantage with this approach is the lack of flexibility. Did you notice that I used the word "fixed" three times in the previous paragraph? If text doesn't fit the predefined field, you're out of luck. If the field is overdimensioned, you'll end up with plenty of white space. This approach is great if you can predict what the data will be like. A typical use case is a ticket or a voucher. For instance: the empty form is a really nice page, with only a couple of fields where an automated system can put a name, a date, a time, and a seat number.
This isn't the best approach for the example you show in your screen shot. The position of every line of text, every word, every character is known in advance. If you want to replace a short word with a long word (or vice-versa), then all those positions (of each line, of the complete page, possibly of the complete document) need to be recalculated. That's madness. Only people with very poor design skills come up with such an idea.
A better idea, is to store the template as HTML. See for instance chapter 5 of iText's pdfHTML tutorial, where we have this snippet of HTML:
<html>
<head>
<title>Invitation to SXSW 2018</title>
</head>
<body>
<u><b>Re: Invitation</b></u>
<br>
<p>Dear <name>SXSW visitor</name>,
we hope you had a great SXSW film festival experience last year.
And we would like to invite you to the next edition of SXSW Film
that takes place from March 9 until March 17, 2018.</p>
<p>Sincerely,<br>
The SXSW crew<br>
<date>August 4, 2017</date></p>
</body>
</html>
Actually, it's not really HTML, because the <name> tag and the <date> tag don't exist in HTML. All HTML processors (browsers as well as pdfHTML) ignore those tags and treat their content as if the tag was a <span>:
It doesn't make much sense to have such tags in the context of pure HTML, but it does make a lot of sense in the case of pdfHTML. With pdfHTMLL, you can configure custom tags, and have a result that looks like the PDFs shown below:
Look at the document for "John Doe" and compare it with the document for "Bruno Lowagie". The name "John Doe" is much shorter than my name, hence more words fit on that first line. The text flows nicely (we could also have chosen to justify the text on both sides). This "flow" is impossible to achieve with your approach, because you will never get a PDF template to reflow nicely.
OK, I get it, you probably say, but what about the practical aspects? You talk about a Java / .Net library, but I am working with Laravel and Angular.js. First, let me tell you that I don't think you'll find any good PDF tools for Laravel or Angular.js, because of the nature of PDF and those development environments (in my opinion, those technologies don't play well together). Regardless of my opinion, this shouldn't be much of a problem for you because you work in an Amazon environment. AWS supports Java, and the Java code needed to get pdfHTML working is minimal. Most of the code samples I wrote for the pdfHTML tutorial are shorter than 15 lines. So why not try Java and pdfHTML?

If you're already using Amazon services, why not use an amazon lambda function, in combination with iText7 (java), to generate the pdf on demand?
That way, you are guaranteed that the pdf is correct, and has nice layout every time.
Generating the pdf can either be done by:
converting HTML,
programmatically creating your entire document,
filling and flattening an XFA form.
I think for your use-case, either option 1 or 2 are the most sustainable.

Related

How to generate text from wikidata json

I am looking for pointers for libraries or methods that would be able to generate full text from the structured information returned by Wikidata - if possible in multiple languages.
To be clearer: from data like the one provided here (this is the JSON version) I would like to be able to generate text similar to the intro paragraph of the wikipedia page for the same item:
Orvieto Cathedral (Italian: Duomo di Orvieto; Cattedrale di Santa Maria Assunta) is a large 14th-century Roman Catholic cathedral dedicated to the Assumption of the Virgin Mary and situated in the town of Orvieto in Umbria, central Italy.
The reason is that the text is provided by Wikipedia for all those cases where a page exists, but I would like to have something also for the Wikidata items without a wikipedia page.
My problem #1 here is: I don't know what something like this is called, so I have no idea what to google for. Any pointers to start from are appreciated, including services or APIs.
This problem falls under the Data to text generation task. I do not know of any services that are currently offering solution. You can look at WEBNLG challenge which has the same objective and similar data. AFAIK mostly template based methods are used to automatically insert data from wikidata into wikipedia as text.

Script tags in React JSON-LD Schema

Summary
I'm implementing Schema.org JSON-LD structured Data into a React application that's very content heavy. I've got it all set up properly, but I'm questioning whether the way I set it up is best practice or acceptable?
The Question
I have tags minified throughout the body of my code within each element. I question this approach because it seems inefficient to have script tags all throughout the body rather than trying to consolidate them in the head tag under 1 big script tag with all the JSON-LD.
Example:
Let's say I have an eCommerce category with a lot of products on the page. Each product is contained in a <div>. Within each product div I'm providing a schema.org tag.
<div className="product-1-example">
<script type="application/ld+json">{"#context":"http://schema.org/", "#type":"Product","name":"3rd thing"}</script>
</div>
<div className="product-2-example">
<script type="application/ld+json">{"#context":"http://schema.org/", "#type":"Product","name":"3rd thing"}</script>
</div>
Here's a screenshot if the example above doesn't help of how the code is outputting:
Is this an OK approach? It just seems bizarre to me to have script tags like this all over the place? The problem I'm having as well is that because of my component structure, I can't really bundle up 1 nice tag at the top with all the consolidated structured data (i.e. grouping all the product JSON-LD data into 1). I could maybe build a script tag at the top with most of the data, and then fill out the rest with microdata?
The only way to really know is to test with the systems you want to read your markup.
There is no reason doing it that way is wrong. And I presume its done that way as its added at the point the system is processing those entities. Maybe neater to have each add their own script at the top instead of inline if possible.
I personally prefer to keep entities in their own scripts. If there is a bug in one, it will not stop the others from being parsed. You can have entities cross reference to each other by their ids.
Try not to mix with microdata. You can't cross reference ids between the two.
You probably also need to think about your entity structure. Typically you only want one main top level entity that represents what the page is about. Some other top level entities are fine as they are considered WebPage related, e.g. BreadcrumbList. But you don;t want to send mixed messages. e.g. if you mark up 10 products, which is the one the page is about? If you mark up a Product and Article. is the page an Article or about a Product?

DotNetNuke parse HTML before display

Could anyone tell me if there's some way of "hooking in" to DotNetNuke so that I can, for example, search and replace text for ALL HTML modules on the site?
e.g. if I use an HTML editor and enter the text {{replace_me}}, then I could have some code that detects "{{replace_me}}" every time a page is rendered and replace it with something else.
Please note that this is a simple example - there may be other ways of "replacing" text - however the actual use case we have is very specific and there will be some significant processing to decide what to replace :) - so whatever solution we implement should basically be:
Get HTML from DB -> Process it however we wish in full C# -> Deliver the modified string.
Thanks!
I believe you can do this with the use of an HTTPModule. Ifinity.com.au used to sell a module that did this, looks like you might be able to download it now for free (maybe?) at http://www.ifinity.com.au/Products/Inline_Link_Master/Product_Details

How to add a custom field into template.php using Zen sub theme

First time poster here, I'm a designer not skilled at all with php and I have a small issue I don't seem to be able to solve. I'm making a site in drupal 7 using a sub theme on zen.
Btw this is a great CMS, even though people say it's really more a developers CMS. I have no trouble to do what I need using views, rules, display suite etc. So a big thank you for all the developers out there making this such a good CMS. But for this apparently simple problem... no module will help me (I think) and I'm kinda stuck.
So here it is: I'd like to add a subtitle next to the title in all my pages.
So what I did was to add a custom field into the content type basic page (machine name: field_sub_title) which is a simple text field.
I uncommented the following line in my template.php
function mytheme_preprocess_page(&$variables, $hook) {
$variables['sub_title'] = t('field_sub_title');
}
Now my question is how do I load the content of my custom field into that variable?
I know i need to change the second part, but I don't have a clue as into what I need to change this.
Displaying the variable into the the page.tpl.php is something I know about so I only need help with the first part.
{EDIT}
Ok I found how to do this :)
I was looking for a solution in the wrong place. I don't need to change any thing in the template.php file.
Just needed to add this bit of code into my page.tpl.php:
<?php
print $node->field_sub_title['und'][0]['value'];
?>
So I'm posting this here for other Drupal newbies struggling with this....
Your solution may work for now, but there may be a more Drupal-y way to handle a problem like this. If you haven't noticed any problems yet, you may find one or more of the following issues down the road:
Someone who doesn't know php or Drupal theming may need to change the way this works.
If you're like me, you may forget where exactly in code this was implemented.
You may see superfluous markup and/or errors on nodes (content) that do not have this sub-title field (ie. event content not having a sub-title field while basic pages and news articles do).
When you add a field to a content type, it will automatically appear anytime content in that content type is displayed. You should be able to add the sub-title field for your page, event or whatever else you need and have it automatically appear in the markup.
You can 'manage display' of a content type to drag and drop the order for fields to appear. You could take it a step further by using a module like Display Suite to add formatting or layout per-content type.
If you feel like this isn't good enough and the markup for the subtitle must be at the same level as the page title (which is rare), at least add an if statement to make your code check to see if the variable is present before trying to print it. I'd also add a new variable and comments for code readability.
<?php
$subtitle = $node->field_sub_title['und'][0]['value'];
if($subtitle){
print $subtitle;
}
?>
Consider using field_get_items or field_view_value, or at least use the LANGUAGE_NONE constant instead of 'und'
See https://api.drupal.org/api/drupal/modules%21field%21field.module/function/field_get_items/7 and https://api.drupal.org/api/drupal/modules!field!field.module/function/field_view_value/7
This has the added benefit of reducing the number of potential security holes you create.

How do you build a multi-language web site?

A friend of mine is now building a web application with J2EE and Struts, and it's going to be prepared to display pages in several languages.
I was told that the best way to support a multi-language site is to use a properties file where you store all the strings of your pages, something like:
welcome.english = "Welcome!"
welcome.spanish = "¡Bienvenido!"
...
This solution is ok, but what happens if your site displays news or something like that (a blog)? I mean, content that is not static, that is updated often... The people that keep the site have to write every new entry in each supported language, and store each version of the entry in the database. The application loads only the entries in the user's chosen language.
How do you design the database to support this kind of implementation?
Thanks.
Warning: I'm not a java hacker, so YMMV but...
The problem with using a list of "properties" is that you need a lot of discipline. Every time you add a string that should be output to the user you will need to open your properties file, look to see if that string (or something roughly equivalent to it) is already in the file, and then go and add the new property if it isn't. On top of this, you'd have to hope the properties file was fairly human readable / editable if you wanted to give it to an external translation team to deal with.
The database based approach is useful for all your database based content. Ideally you want to make it easy to tie pieces of content together with their translations. It only really falls down for all the places you may want to output something that isn't out of a database (error messages etc.).
One fairly old technology which we find still works really well, is to use gettext. Gettext or some variant seems to be available for most languages and platforms. The basic premise is that you wrap your output in a special function call like so:
echo _("Please do not press this button again");
Then running the gettext tools over your source code will extract all the instances wrapped like that into a "po" file. This will contain entries such as:
#: myfolder/my.source:239
msgid "Please do not press this button again"
msgstr ""
And you can add your translation to the appropriate place:
#: myfolder/my.source:239
msgid "Please do not press this button again"
msgstr "s’il vous plaît ne pas appuyer sur le bouton ci-dessous à nouveau"
Subsequent runs of the gettext tools simply update your po files. You don't even need to extract the po file from your source. If you know you may want to translate your site down the line, then you can just use the format shown above (the underscored function) with all your output. If you don't provide a po file it will just return whatever you put in the quotes. gettext is designed to work with locales so the users locale is used to retrieve the appropriate po file. This makes it really easy to add new translations.
Gettext Pros
Doesn't get in your way while coding
Very easy to add translations
PO files can be compiled down for speed
There are libraries available for most languages / platforms
There are good cross platform tools for dealing with translations. It is actually possible to get your translation team set up with a tool such as poEdit to make it very easy for them to manage translation projects
Gettext Cons
Solves your site "furniture" needs, but you would usually still want a database based approach for your database driven content
For more info on gettext see this wikipedia page
They way I have designed the database before is to have an News-table containing basic info like NewsID (int), NewsPubDate (datetime), NewsAuthor (varchar/int) and then have a linked table NewsText that has these columns: NewsID(int), NewsText(text), NewsLanguageID(int). And at last you have a Language-table that has LanguageID(int) and LanguageName(varchar).
Then, when you want to show your users the news-page you do:
SELECT NewsText FROM News INNER JOIN NewsText ON News.NewsID = NewsText.NewsID
WHERE NewsText.NewsLanguageID = <<Session["UserLanguageID"]>>
That Session-bit is a local variable where you store the users language when they log in or enters the site for the first time.
Java web applications support internationalization using the java standard tag library.
You've really got 2 problems. Static content and dynamic content.
for static content you can use jstl. It uses java ResourceBundles to accomplish this. I managed to get a Databased backed bundle working with the help of this site.
The second problem is dynamic content.
To solve this problem you'll need to store the data so that you can retrieve different translations based on the user's Locale. (Locale includes Country and Language).
It's not trivial, but it is something you can do with a little planning up front.
#Auron
thats what we apply it to. Our apps are all PHP, but gettext has a long heritage.
Looks like there is a good Java implementation
Tag libraries are fine if you're using JSP, but you can also achieve I18N using a template-based technology such as FreeMarker.

Resources