Split single file to multiple "posts" in Hugo?

Split single file to multiple "posts" in Hugo? - hugo

Hugo's one-file->one-page model is nice and simple, especially for a blog. But sometimes you want to write an "article" for a blog and split it into 2 or more pieces for separate pages (perhaps to be posted on separate days, or whatever).
Is there a way to do this within Hugo? Perhaps a combination of something you put in a layout/theme/shortcode and internal markup within the page (to signal where to split the article)?
Possible models might include:
1 input post "splits" into 2/3/4 posts when the site is built to public
1 input post is duplicated into 2/3/4 posts when the site is built to public but somehow each duplicate isn't an exact duplicate but instead has the whole post but certain parts of the post are hidden/invisible, via CSS, such that they represent the 2/3/4 "pages" of the post.
Or, is this something you do external to Hugo?
UPDATE: I can see I need to clarify this. Consider this random illustrative blog post - it is the third of three closely related posts, and even has a set of links at the top so you can find the earlier posts in the series. Lots of technical blogs do this sort of thing (at least the ones I read).
Now, I'm not looking for a CMS or anything complex. What I do now with Hugo is hugo new posts/an-article-about-constexpr.md and I write one markdown file and it becomes one "post" in standard Hugo fashion. Exactly what you want a SSG to do.
What I want to do is write one markdown file but have some kind of markup in it separating it into sections (like  on steroids) so that instead of generating one page of my site it generates three (in this example) - three separate articles with links from the main page in the "posts" section, etc. etc. And for bonus points, I'd like to generate these "table of contents" sections with links to each of the pages.
So I've been doing that with a cobbled-up awk script that generates pages right next to the post, in the posts directory. I set the post to draft so it doesn't get published, but the pages generated by the awk script have draft=false so they do get published. And the dates get set so they're "in order".
And that's working, but before I invest more time in my little script, I wanted to see if there was a proper way to do this within hugo.

Not sure what you mean by one-file->one-page model.
I have very few parts of any hugo site which one markdown file=one rendered html page.
Could just be the way I build, but everything I've done so far has been vanilla hugo.
To answer your question: Yes, you are correct that would work. There a few ways to do this (I list one below), but maybe a deeper look would be separating the concept of a "tool-chain" and what Hugo is in that tool chain, from a CMS, which Hugo is not.
So, to possibly answer your specific question though:
You can store content in markdown, markdown front matter, or a Data form (XML/JSON) in hugo. Using the page resources {{ .GetPage }} you can access any content and load it in any template or using shortcodes, load it in other markdown.
If I needed to do this as part of a tool chain, i.e. use specific markdown and re-use it in multiple places, I would create a front matter variable, or taxonomy or tag depending on what groupings I needed where, so this was scalable. params such as
"articleAuthor: Jessie P."
"date: DATE HERE"
"tags: etc. etc."
Then lets say I know that's going to be a blog, well fine, then it will be in the corresponding content folder, but if I needed all of Jessie's articles, or articles on that date, or that specific article, I would use the shortcode I make or directly in a template, using .GetPage Match - import the markdown pages I need based on the parameters I need.
But on the other hand, I would need to understand the problem being solved, but, here are a few hugo docs to help you out:
https://gohugo.io/functions/getpage/#readout
https://gohugo.io/content-management/page-bundles/
Remember, Hugo is not a CMS, it is a site generator. If you want a CMS, you can always use Wordpress headless, or any other solution out there.
(off the top of my head using page bundles)
{{ $headlessBundle := .Site.GetPage "/blogs/specific-blog/index" }}
{{ with $getContent := $headlessBundle.Resources.Match "intro.md" }}
{{ (index $getContent 0).Content }}
(You would use various "Where" statements to "filter" content based on the params or however you delineate what you want).
Or for instance if I wanted only the text that had an H1 tag:
{{ $.Scratch.Set "summary" ((delimit (findRE "(<h1.*?>.*?</h1>\\s*)+" .Content) "[…]") | plainify | replaceRE "&" "&" | safeHTML) }}
{{ $.Scratch.Get "summary" }}
Based on the update to the question:
https://discourse.gohugo.io/t/split-markdown-content-in-two-files-but-dont-render-shortcodes-as-raw-text/32080/2
https://discourse.gohugo.io/t/getting-a-list-from-within-a-shortcode/28126
https://discourse.gohugo.io/t/splitting-content-into-sections-based-on-header-level/33749
https://discourse.gohugo.io/t/multiple-content-blocks-on-a-single-page/9092/3
jrmooring answered it best in the above with clear examples and code.
Though, note: If I was doing this in a technical blog this would be integrated into the CMS and coordinated with the builder.

Related

How to use the same layout in a different directory?

I have a site using the hugo-coder theme, which has a layouts/posts folder that specifies that anything in the "posts" folder will have a blog post format.
I would like to have two different blogs in two different subdirectories, using the same layout. Is there a way to tell Hugo that the content/blog1 directory should use the same settings and layout as the content/posts directory without copying themes/hugo-coder/layouts/posts into layouts/blog1? Ideally I would avoid using symlinks, because, while convenient, I've had a decent amount of software throw weird errors when I use symlinks, so I avoid them when it's possible.

You can set the layout or type field to posts in the frontmatter of your _index.md file in content/blog1.
See this docs page for more info.
Edit: Alternatively, you could create an archetype for blog1 that automatically sets the value to posts in the frontmatter of individual posts in that section, assuming you're using hugo new blog1/postname.md to create posts for that section.
Double edit: The first suggestion didn't work. You could also create subsections within content/posts/blog1 and set the permalinks of posts in that subsection to use the last section only. That should remove the need to explicitly set the type in post frontmatter every time because each post would already have a type of posts.
In config.toml:
[permalinks]
posts = "/:sections[last]/:slug/"

You can use a partial in your templates. If you do that you WILL need the single and list file in the layouts/blog directory, but it could be an empty file referencing the partial. The layouts/posts/single.html and the layouts/blog/single.html both will then look like this:
{{ partial "singleblog.html" . }}
Compeletely DRY... and without much complexity.

Script tags in React JSON-LD Schema

Summary
I'm implementing Schema.org JSON-LD structured Data into a React application that's very content heavy. I've got it all set up properly, but I'm questioning whether the way I set it up is best practice or acceptable?
The Question
I have tags minified throughout the body of my code within each element. I question this approach because it seems inefficient to have script tags all throughout the body rather than trying to consolidate them in the head tag under 1 big script tag with all the JSON-LD.
Example:
Let's say I have an eCommerce category with a lot of products on the page. Each product is contained in a <div>. Within each product div I'm providing a schema.org tag.
<div className="product-1-example">
<script type="application/ld+json">{"#context":"http://schema.org/", "#type":"Product","name":"3rd thing"}</script>
</div>
<div className="product-2-example">
<script type="application/ld+json">{"#context":"http://schema.org/", "#type":"Product","name":"3rd thing"}</script>
</div>
Here's a screenshot if the example above doesn't help of how the code is outputting:
Is this an OK approach? It just seems bizarre to me to have script tags like this all over the place? The problem I'm having as well is that because of my component structure, I can't really bundle up 1 nice tag at the top with all the consolidated structured data (i.e. grouping all the product JSON-LD data into 1). I could maybe build a script tag at the top with most of the data, and then fill out the rest with microdata?

The only way to really know is to test with the systems you want to read your markup.
There is no reason doing it that way is wrong. And I presume its done that way as its added at the point the system is processing those entities. Maybe neater to have each add their own script at the top instead of inline if possible.
I personally prefer to keep entities in their own scripts. If there is a bug in one, it will not stop the others from being parsed. You can have entities cross reference to each other by their ids.
Try not to mix with microdata. You can't cross reference ids between the two.
You probably also need to think about your entity structure. Typically you only want one main top level entity that represents what the page is about. Some other top level entities are fine as they are considered WebPage related, e.g. BreadcrumbList. But you don;t want to send mixed messages. e.g. if you mark up 10 products, which is the one the page is about? If you mark up a Product and Article. is the page an Article or about a Product?

How to add a telephone link via wagtail?

I am trying to add links in the form 555-555-555 arbitrarily into paragraphs of text on my wagtail site. These phone numbers are currently peppered throughout the site as plain text, but I want to convert them to links.
I found this old wagtail github issue where they explained why they would not add them, but the 'Special-purpose pages' use case they described seems to be different than mine: my site has these numbers in paragraphs of text on most of the content pages (blog, product, marketing, etc).
Can anyone explain how I can add telephone links that can be used throughout the site?
I am using wagtail 1.x

To have telephone link within rich text, you'll need to create a plugin for Hallo.js. Have a look at the documentation and how Wagtail 1.13 creates and register such plugins.
Be aware though that it's usually quite involved and that Wagtail 2.0 rich text editor is now Draftail and Hallo.js is deprecated. Therefore, if you create a Hallo.js plugin and upgrade to Wagtail 2.0, you'll have to add some configuration to continue using Hallo.js or recreate the plugin for Draftail.
FWIW, if you are interested in having a look at what would be involved with creating an plugin for Draftail, you'll need to create an entity (also note that the API for creating entities should receive some enhancements in Wagtail 2.2).

With Raw HTML there is nothing to prevent editors from inserting malicious scripts into the page. Do not use this block. http://docs.wagtail.io/en/v2.1/topics/streamfield.html#rawhtmlblock
A workaround would be a custom filter. Eg:
{{ self.text|richtext|phonify }}
In your templatetags.py:
>>> def phonify(val):
... for tel in re.findall(r'tel:(\d+)', val):
... tag = '{}'.format(tel, tel)
... val = val.replace('tel:{}'.format(tel), tag)
... return val
...
>>> phonify('Hello tel:123 world tel:456!')
'Hello 123 world 456!'
>>>
Now you can instruct editors (via help_text) to add phone numbers like tel:5555555555.
This example does not handle - and +1. But if you figure that out, I'll update the answer ;)

I ended up chopping up my paragraphs and including raw html where I needed to add the tel links. A bit tedious, and the styles were slightly different on some pages, but shorter than doing it any other way.

Interpolating custom data onto a PDF

I am building an Angular test preparation app (with Laravel 5.1 API). One of the requirements is to allow the user to print a certificate of achievement.
The client wants the person's name and credentials interpolated into the document (e.g., highlighted below). Here is a snapshot of the PDF template they sent:
The way I'm handling PDF viewing is simply by storing the file on S3 and giving them a link to that file.
Interpolating information into a PDF doc doesn't seem trivial and I haven't found much information on programmatically allowing this, but there are tools like DocHub, that allow you do edit while viewing the PDF.
I'm interested in learning:
is doing this programmatically trivial?
are there 3rd party tools I'm unaware of?
would I even be able to send this information along to the S3 link to interpolate in the first place?

Using PDF as a format for editing is usually a bad choice. If you have a form with fixed fields, then it's easy. Create a PDF template with an interactive form. In this form, based on AcroForm technology, you'll define fields with fixed coordinates, and a fixed size. You can then add content to these fields.
One major disadvantage with this approach is the lack of flexibility. Did you notice that I used the word "fixed" three times in the previous paragraph? If text doesn't fit the predefined field, you're out of luck. If the field is overdimensioned, you'll end up with plenty of white space. This approach is great if you can predict what the data will be like. A typical use case is a ticket or a voucher. For instance: the empty form is a really nice page, with only a couple of fields where an automated system can put a name, a date, a time, and a seat number.
This isn't the best approach for the example you show in your screen shot. The position of every line of text, every word, every character is known in advance. If you want to replace a short word with a long word (or vice-versa), then all those positions (of each line, of the complete page, possibly of the complete document) need to be recalculated. That's madness. Only people with very poor design skills come up with such an idea.
A better idea, is to store the template as HTML. See for instance chapter 5 of iText's pdfHTML tutorial, where we have this snippet of HTML:
<html>
<head>
<title>Invitation to SXSW 2018</title>
</head>
<body>
<u><b>Re: Invitation</b></u>
<br>
<p>Dear <name>SXSW visitor</name>,
we hope you had a great SXSW film festival experience last year.
And we would like to invite you to the next edition of SXSW Film
that takes place from March 9 until March 17, 2018.</p>
<p>Sincerely,<br>
The SXSW crew<br>
<date>August 4, 2017</date></p>
</body>
</html>
Actually, it's not really HTML, because the <name> tag and the <date> tag don't exist in HTML. All HTML processors (browsers as well as pdfHTML) ignore those tags and treat their content as if the tag was a <span>:
It doesn't make much sense to have such tags in the context of pure HTML, but it does make a lot of sense in the case of pdfHTML. With pdfHTMLL, you can configure custom tags, and have a result that looks like the PDFs shown below:
Look at the document for "John Doe" and compare it with the document for "Bruno Lowagie". The name "John Doe" is much shorter than my name, hence more words fit on that first line. The text flows nicely (we could also have chosen to justify the text on both sides). This "flow" is impossible to achieve with your approach, because you will never get a PDF template to reflow nicely.
OK, I get it, you probably say, but what about the practical aspects? You talk about a Java / .Net library, but I am working with Laravel and Angular.js. First, let me tell you that I don't think you'll find any good PDF tools for Laravel or Angular.js, because of the nature of PDF and those development environments (in my opinion, those technologies don't play well together). Regardless of my opinion, this shouldn't be much of a problem for you because you work in an Amazon environment. AWS supports Java, and the Java code needed to get pdfHTML working is minimal. Most of the code samples I wrote for the pdfHTML tutorial are shorter than 15 lines. So why not try Java and pdfHTML?

If you're already using Amazon services, why not use an amazon lambda function, in combination with iText7 (java), to generate the pdf on demand?
That way, you are guaranteed that the pdf is correct, and has nice layout every time.
Generating the pdf can either be done by:
converting HTML,
programmatically creating your entire document,
filling and flattening an XFA form.
I think for your use-case, either option 1 or 2 are the most sustainable.

Is it possible for an entry to have two URL in Expression Engine, and translate template names?

I'm currently making a bilingual Expression Engine 2.5.2 website. I'm using this technique to create the two langues, which works perfectly.
I have created a {country_code} global variable in the two index.php files which allows me to detect the current language.
Using this technique, I have no problems to get language-relative data when accessing an entry. My only concern is that I apparently have to privilege a language-specific "clean" URL.
Example entry:
{entry_id} = 123
{title} = My test article
{title_permalink} = my-test-article
{name_fr} = Mon article
{name_en} = My article
If I request http://www.example.com/index.php/en/blog/articles/my-test-article, I expect to to find, in english, "My article" using the template articles in the blog template group.
Everything is fine, but the french translation is accessible when requesting http://www.example.com/index.php/fr/blog/articles/my-test-article. The correct translation of the URL should be http://www.example.com/index.php/fr/blogue/articles/mon-article-test.
Anyone encountered a problem like this? Any solutions via extensions or modules?

I believe the Transcribe module solves this by both providing the ability to translate template group and template names, and having you create a separate entry for each language and piece of content in your site (hence, you have two separate URL titles). But that means buying into their entire methodology for a multi-lingual site.
Myself, I usually just stick to using the entry_id instead of the url_title, and live with the template names being in the primary language.

The best way I found to achieve this is by embedding templates with segment translations, duplicating template groups and duplicating channels.
In the blog/articles template:
{embed="shared/.head" segment_2_translation="blogue" segment_3_translation="articles"}
In the blogue/articles template:
{embed="shared/.head" segment_2_translation="blog" segment_3_translation="articles"}
In shared/.head template:
[...] {if lang == "fr"}English{if:else}Français{/if} [...]
And then you can create a Articles (FR) and a Articles (EN) channels, and each will have their unique URL titles. You can also add a relationship custom field for each channel to associate an entry with it's translation.
It feels messy, but it is the only way I could make it work without modules, plugins or whatnot.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight