How to make Sphinx resolve URL links from e.g. /about-manual to /about-manual.html

I'm creating Sphinx documentation and I'm struggling to identify the "proper" way to set up the structure and links.
STRUCTURE 1#
Currently, my structure looks as below:
index.rst
about-manual/index.rst
Inside my root index.rst, the toctree is as below:
===========================
Contents
===========================

.. toctree::

   about-manual/index
This results in the below links:
https://example.com/docs/ --> Content of index.rst
https://example.com/docs/about-manual --> Content of about-manual/index.rst
This works as intended in terms of link resolution
However, I'm unsure if it's the "proper" way of setting up my Sphinx structure
STRUCTURE 2#
index.rst
about-manual.rst
Inside my root index.rst, the toctree is as below:
===========================
Contents
===========================

.. toctree::

   about-manual
This results in the below links:
https://example.com/docs/ --> Content of index.rst
https://example.com/docs/about-manual --> ERROR
https://example.com/docs/about-manual.html --> Content of about-manual.rst
This results in a more compact/simple Sphinx structure.
However, if a user enters a URL without an explicit .html at the end, the link is broken.
Am I missing a basic configuration setting in Sphinx that would make link resolution work as expected with 'STRUCTURE 2#', without having to add the explicit .html at the end?
And is it possible to keep Sphinx from explicitly appending index.html to the end of a URL path? The root index resolves without it, as expected, but in 'STRUCTURE 1#' all sub-pages explicitly show index.html at the end.
I've looked at html_file_suffix and html_link_suffix, but I've not been able to make these work for my purpose either.

I found a solution: use sphinx-build -b dirhtml instead of sphinx-build -b html (as proposed by Jesse Tan from the sphinx_rtd_theme team) - for details see this link.
It basically lets me use the approach in 'STRUCTURE 2#', but it builds each page as an index.html file inside a folder named after the corresponding .rst file.
Importantly, the internal links are also updated so that they do not include the index.html part.
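To illustrate the difference between the two builders, here is a minimal Python sketch of how a document name maps to an output file and to the link used for it. It only mirrors the behaviour described above; the function names output_file and public_url are made up for this example and are not part of the Sphinx API.

```python
def output_file(docname: str, builder: str = "html") -> str:
    """Return the output path (relative to the build directory) for a document."""
    if builder == "dirhtml":
        # Documents that are already named "index" stay where they are ...
        if docname == "index" or docname.endswith("/index"):
            return docname + ".html"
        # ... every other document gets its own folder holding an index.html.
        return docname + "/index.html"
    # The plain html builder writes one .html file per document.
    return docname + ".html"


def public_url(docname: str, builder: str = "html") -> str:
    """Return the relative link that points at a document."""
    if builder == "dirhtml":
        if docname == "index":
            return ""
        if docname.endswith("/index"):
            # Drop the trailing "index" so the link ends at the folder.
            return docname[: -len("index")]
        return docname + "/"
    return docname + ".html"


for name in ("index", "about-manual", "about-manual/index"):
    print(name, "->", output_file(name, "dirhtml"), "|", public_url(name, "dirhtml"))
```

Under this mapping, both 'STRUCTURE 1#' and 'STRUCTURE 2#' end up as about-manual/index.html, and links point at about-manual/ with no file name. Whether a request for /about-manual without the trailing slash is redirected depends on the web server, not on Sphinx.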


Split single file to multiple "posts" in Hugo?

Hugo's one-file->one-page model is nice and simple, especially for a blog. But sometimes you want to write an "article" for a blog and split it into 2 or more pieces for separate pages (perhaps to be posted on separate days, or whatever).
Is there a way to do this within Hugo? Perhaps a combination of something you put in a layout/theme/shortcode and internal markup within the page (to signal where to split the article)?
Possible models might include:
1 input post "splits" into 2/3/4 posts when the site is built to public
1 input post is duplicated into 2/3/4 posts when the site is built to public but somehow each duplicate isn't an exact duplicate but instead has the whole post but certain parts of the post are hidden/invisible, via CSS, such that they represent the 2/3/4 "pages" of the post.
Or, is this something you do external to Hugo?
UPDATE: I can see I need to clarify this. Consider this random illustrative blog post - it is the third of three closely related posts, and even has a set of links at the top so you can find the earlier posts in the series. Lots of technical blogs do this sort of thing (at least the ones I read).
Now, I'm not looking for a CMS or anything complex. What I do now with Hugo is hugo new posts/an-article-about-constexpr.md and I write one markdown file and it becomes one "post" in standard Hugo fashion. Exactly what you want a SSG to do.
What I want to do is write one markdown file but have some kind of markup in it separating it into sections (like <!-- More --> on steroids) so that instead of generating one page of my site it generates three (in this example) - three separate articles with links from the main page in the "posts" section, etc. etc. And for bonus points, I'd like to generate these "table of contents" sections with links to each of the pages.
So I've been doing that with a cobbled-up awk script that generates pages right next to the post, in the posts directory. I set the post to draft so it doesn't get published, but the pages generated by the awk script have draft=false so they do get published. And the dates get set so they're "in order".
And that's working, but before I invest more time in my little script, I wanted to see if there was a proper way to do this within hugo.
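For reference, the external pre-processing approach described above could be sketched in Python roughly as follows. This is an illustration, not the awk script itself: the marker string, the split_post helper and the "part N of M" title scheme are all assumptions, and date handling is left out.

```python
from typing import List

SPLIT_MARKER = "<!-- split -->"  # hypothetical marker, chosen for this sketch


def split_post(body: str, title: str, marker: str = SPLIT_MARKER) -> List[str]:
    """Split one markdown body into several pages, each with its own front matter."""
    # Cut the body at every marker and drop empty pieces.
    sections = [piece.strip() for piece in body.split(marker)]
    sections = [piece for piece in sections if piece]
    total = len(sections)

    pages = []
    for number, section in enumerate(sections, start=1):
        front_matter = "\n".join(
            [
                "---",
                f'title: "{title} (part {number} of {total})"',
                "draft: false",  # the generated pages are published
                "---",
            ]
        )
        pages.append(f"{front_matter}\n\n{section}\n")
    return pages


pages = split_post("Intro text\n<!-- split -->\nSecond part\n", "Constexpr")
for page in pages:
    print(page)
```

Each returned string would then be written to its own file in the posts directory, while the source file stays marked as a draft.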
Not sure what you mean by one-file->one-page model.
I have very few parts of any hugo site which one markdown file=one rendered html page.
Could just be the way I build, but everything I've done so far has been vanilla hugo.
To answer your question: Yes, you are correct that would work. There are a few ways to do this (I list one below), but it may be worth taking a deeper look at the concept of a "tool chain" and at Hugo's place in it, as distinct from a CMS, which Hugo is not.
So, to possibly answer your specific question though:
You can store content in markdown, markdown front matter, or a data file (XML/JSON) in Hugo. Using the page resources {{ .GetPage }} you can access any content and load it in any template or, using shortcodes, load it in other markdown.
If I needed to do this as part of a tool chain, i.e. use specific markdown and re-use it in multiple places, I would create a front matter variable, taxonomy, or tag, depending on what groupings I needed where, so that this stays scalable. Params such as:
"articleAuthor: Jessie P."
"date: DATE HERE"
"tags: etc. etc."
Then let's say I know that's going to be a blog post: fine, it will live in the corresponding content folder. But if I needed all of Jessie's articles, or articles from that date, or that specific article, I would use .GetPage and Match, either in a shortcode I write or directly in a template, to import the markdown pages I need based on those parameters.
I would need to understand the problem being solved to say more, but here are a few Hugo docs to help you out:
https://gohugo.io/functions/getpage/#readout
https://gohugo.io/content-management/page-bundles/
Remember, Hugo is not a CMS, it is a site generator. If you want a CMS, you can always use Wordpress headless, or any other solution out there.
(off the top of my head using page bundles)
{{ $headlessBundle := .Site.GetPage "/blogs/specific-blog/index" }}
{{ with $getContent := $headlessBundle.Resources.Match "intro.md" }}
{{ (index $getContent 0).Content }}
{{ end }}
(You would use various "Where" statements to "filter" content based on the params or however you delineate what you want).
Or for instance if I wanted only the text that had an H1 tag:
{{ $.Scratch.Set "summary" ((delimit (findRE "(<h1.*?>.*?</h1>\\s*)+" .Content) "[&hellip;]") | plainify | replaceRE "&amp;" "&" | safeHTML) }}
{{ $.Scratch.Get "summary" }}
Based on the update to the question:
https://discourse.gohugo.io/t/split-markdown-content-in-two-files-but-dont-render-shortcodes-as-raw-text/32080/2
https://discourse.gohugo.io/t/getting-a-list-from-within-a-shortcode/28126
https://discourse.gohugo.io/t/splitting-content-into-sections-based-on-header-level/33749
https://discourse.gohugo.io/t/multiple-content-blocks-on-a-single-page/9092/3
jrmooring answered it best in the above with clear examples and code.
Though, note: If I was doing this in a technical blog this would be integrated into the CMS and coordinated with the builder.

Static alias to auto generated section URLs with ReST / Sphinx

I have a web application with some "Help" buttons which point to my online documentation. The links to the help sections are hardcoded in the app database. Previously, the documentation was made in HTML and JS, and I could control the URLs to the section manually.
However, now that I am migrating to Sphinx and ReST, I found the automatic section URL generation great, but cannot figure out how to control this behaviour for my structure.
Is there a way to have a sort of URL alias that points to the actual URL in my documentation, so that I don't have to update the hardcoded links in the app db every time I rename a chapter or section?
For instance:
I have a subsection called "I like apples" in Chapter 1.
My hardcoded link to it would be something like
"Chapter1#I-like-apples" (I only care about the part following the # sign)
I change the title to "I hate apples". The new link would become "Chapter1#I-hate-apples", but in my db I still need to have "#I-like-apples" pointing to the same section.
See Hyperlink Targets in the docutils documentation, specifically "internal hyperlink targets".
.. _my-target:
.. _synonym-to-my-target:

My Subsection
-------------
Sphinx will generate targets for each synonym.
You could also do indirect hyperlink targets.
.. _my-target: synonym-to-my-target_
.. _synonym-to-my-target:

My Subsection
-------------
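Explicit targets help because the automatic anchor is derived from the section title, so it changes whenever the title changes. The Python sketch below approximates that derivation. Note that approximate_anchor is a hypothetical helper written for this illustration, not the docutils implementation, and the exact anchors Sphinx emits may differ in detail.

```python
import re


def approximate_anchor(title: str) -> str:
    """Roughly imitate how a section title is turned into an HTML anchor."""
    # Lowercase the title and collapse every run of other characters into "-".
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


# The automatic anchor follows the title ...
print(approximate_anchor("I like apples"))
print(approximate_anchor("I hate apples"))

# ... whereas an explicit target such as ".. _my-target:" keeps the same name
# no matter how the section below it is retitled.
STABLE_ANCHOR = "my-target"
print(STABLE_ANCHOR)
```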

cakephp Managing Plugin Views how to determine paths

I have a plugin installed that has its own layout overrides for different controllers. However I'm having trouble understanding the mechanism for modifying the paths.
In the plug-in controller if I tell it to use my layout
$this->layout = 'default_dashboard';
Which is in app/Views/Layout and references an image in app/webroot/default_images.
All the relative links to default_images work fine when I do this, but I would like to use some of the Plugin template overrides for other actions.
However, if I modify the default.ctp file to include some of the images, say a logo that is used in default_dashboard.ctp, it is unable to map to the same image location.
For example in default.ctp:
echo $this->Html->image('default_images/logo.png', array('alt' => 'Logo', 'width' => '284', 'height' => '82'));
produces a path to /img/default_images/logo.png. The Plugin is configured to use the /img location, whereas I want to direct to /default_images in this case. I could make this ../default_images/logo.png, but this isn't very clean.
In addition I have js and css which is having a similar problem. Can someone please explain the mechanism for using a site-wide default.ctp so that it works with inherited plugin templates?
From hard coding the links into the template not using the Html Helper, I see that the browser's relative path is confused because of the routing. For example the first one works with the root specified, the second doesn't.
<img src="/default_images/logo.png" alt="works" width='284' height='82'>
<img src="default_images/logo.png" alt="lost" width='284' height='82'>
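The difference between the two tags comes down to how a browser resolves a URL against the address of the current page. The Python sketch below reproduces that resolution with the standard library. The page address used here is a hypothetical routed plugin URL, chosen only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical address of a page served through a plugin route.
page_url = "http://example.com/my_plugin/dashboard/index"

# A leading slash resolves against the site root, whatever the route is.
root_relative = urljoin(page_url, "/default_images/logo.png")

# Without the slash, the path is resolved against the current "directory"
# of the routed URL, which is not where the image lives.
page_relative = urljoin(page_url, "default_images/logo.png")

print(root_relative)  # http://example.com/default_images/logo.png
print(page_relative)  # http://example.com/my_plugin/dashboard/default_images/logo.png
```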
What's the best way to make sure that the Plugin layouts and non-plugin layouts can all find the correct path to /default_images ?
The following steps resolve the relative path problem:
Create a file abc_constants.php in the app\Config folder.
Include the file in app\Config\bootstrap.php:
require_once('abc_constants.php');
abc_constants.php should contain:
define('HTTP_HOST', "http://" . $_SERVER['HTTP_HOST'].'/');
define('SITE_URL', HTTP_HOST.'your_app_name/');
define('IMAGE_HTTP_PATH', SITE_URL.'app/webroot/default_images/');
Use these constants in your view file accordingly.
<?php echo $this->Html->image(IMAGE_HTTP_PATH.'logo.png', array('alt' => 'Logo', 'width' => '284', 'height' => '82')); ?>
This may look like a lengthy process at first, but once it is in place you can use these constants in Ajax calls in view files, in controller code, and so on.

How to alter the prefixes EPiServer is adding to src attributes in html

I have a fragment of html which is contained in a property of a templated EPiServer page, within that html there is an img tag which has a relative url in it.
When the page is viewed, I can see the src attribute of the tag has been altered to have the prefix /ProjectName/Templates/Pages/.
I understand that this is being done by HtmlRewriteToExternal so that image files that are stored alongside the aspx template (which does indeed live in Templates\Pages) are located correctly, however the image which is intended to be part of the html fragment is in my case actually stored under PageFiles/nnn/ (where nnn is actually the parent page's PageFolderID), and I need to somehow make the altered html reflect that.
I've created a class that inherits from FriendlyUrlRewriteProvider and registered my class. I can debug the application, and watch the requests go through the overridden methods, but I still can't see where the prefix is being added or get any idea how to change it. I can alter the src tag to a different relative path in my class, but the prefix is still being added.
I've read everything I can find on the EPiServer url rewriting, but can't find anything that hints as to where this prefix is being added or how to stop that or change it.
Things I've read:
http://blogs.interakting.co.uk/post/File-Extensions-and-URL-Rewriting-in-EPiServer.aspx
http://blog.fredrikhaglund.se/blog/2008/05/07/disable-episerver-urlrewriter-interference/ (this may contain the answer I'm looking for)
http://labs.kaliko.com/2010/11/prevent-episerver-urlrewrite.html
http://sourcecodebean.com/archives/episerver-friendly-urls-for-paginated-pages-and-why-the-asplinkbutton-must-die/510
http://tedgustaf.com/en/blog/2008/7/create-a-custom-url-rewrite-provider-for-episerver/
http://tedgustaf.com/en/blog/2011/4/publishing-plain-html-pages-in-episerver/
http://sdk.episerver.com/library/cms5/Developers%20Guide/Friendly%20URL.htm
http://sdk.episerver.com/library/cms6.1/html/T_EPiServer_Web_UrlRewriteModule.htm
http://labs.episerver.com/en/Blogs/Ruwen/Dates/111218/112064/112154/
http://world.episerver.com/Blogs/Magnus-Strale/Dates/2011/3/Do-we-really-need-yet-another-HTML-parser/
http://world.episerver.com/Blogs/Yugeen-Klimenko/Dates/2011/6/How-EPiServer-URL-Rewriting-works/
http://world.episerver.com/Modules/Forum/Pages/Thread.aspx?id=46869
I'm open to completely different solutions for what I'm actually trying to achieve, which is as follows:
I have multiple independent sets of static html files and related image / css / js files, which I'm trying to store / publish with EPiServer. The structure of each set looks something like
setfolder/
    htmlfileA.html
    htmlfileB.html
    css/
        styles.css
    images/
        piccy1.png
        piccy2.png
    js/
        magic.js
I've figured that I should create an EPiServer page for the set, and then child pages for each html file, storing the html from the files in a property of the child pages. Currently I'm storing the related static files in the PageFiles of the relevant setfolder page, as that seems to be the most logically consistent place to put them.
It's hard to give the best solution without seeing it all in front of you. But one easy way is to alter the HTML code when you print the property to the page.
For example: <%= ChangeRelativeLinks(CurrentPage["HtmlCode"] as string) %>
In ChangeRelativeLinks(string htmlCode), use a regular expression or similar to rewrite relative links and image paths into absolute paths under the page directory.
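The rewriting logic could look roughly like the sketch below. It is written in Python purely to show the idea and would need porting to C# for the template. The function name change_relative_links, the page_dir parameter and the /PageFiles/123/ value are assumptions for this example, and a regular expression like this will not cope with every kind of HTML.

```python
import re

# Matches src="..." or href="..." whose value is relative, i.e. does not
# start with a scheme (http:, mailto:, ...), a slash, or a fragment (#).
_RELATIVE_URL = re.compile(
    r'''\b(?P<attr>src|href)=(?P<quote>["'])(?!(?:[a-z][a-z0-9+.-]*:|/|#))(?P<url>[^"']+)(?P=quote)''',
    re.IGNORECASE,
)


def change_relative_links(html_code: str, page_dir: str) -> str:
    """Prefix every relative src/href value with the page directory."""
    prefix = page_dir.rstrip("/") + "/"

    def _rewrite(match):
        quote = match.group("quote")
        return f'{match.group("attr")}={quote}{prefix}{match.group("url")}{quote}'

    return _RELATIVE_URL.sub(_rewrite, html_code)


sample = '<img src="images/piccy1.png"> <a href="http://example.com/">x</a>'
print(change_relative_links(sample, "/PageFiles/123/"))
```

Absolute and root-relative URLs are left untouched, so only the links that belong to the stored HTML fragment are redirected to the page folder.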
If you are storing the images in PageFiles, which is a Virtual Path Provider, you should be able to get the URL to your file simply by using the API. On the PageData class (i.e. CurrentPage in your template) you have a method called GetPageDirectory() which gets the page folder.
You can read more about VPP concepts here:
http://sdk.episerver.com/library/cms6.1/Developers%20Guide/Core%20Features/File%20System/File%20System%20and%20VPPs.htm
No need for a url rewrite provider for this I think.

Jackrabbit XPath Issue

I'm relatively new to Jackrabbit. In our application we never turned on the SearchIndex section in the repository.xml file (or in workspace.xml) because we always go directly to a given document using the JCR UUID reference. We are using Jackrabbit v2.2.1 with Oracle as the repository. Now our requirements are expanding: we would like to use the document metadata feature to store contextual info about a document, so that we can use the metadata to retrieve a selected set of documents.
As the first step, I added the default SearchIndex section in workspace.xml file and restarted the JCR.
I saw a bunch of lines like this in my log file - then I saw it created the index folder under workspace area.
2011-07-05 15:04:01.724 INFO [WebContainer : 0] MultiIndex.java:1204 indexing... /vfs:metaData/21ee130e-978e-415f-bfd1-7aa03d91608c/vfs:attributes (3500)
I have a folder structure like this. When I create a document in JCR, I specify the metadata info as part of the document, which is defined by a complex XSD type with tags like docType, uploadedBy, contextValue, etc.
/ (root)
    /MyApp (sub-folder)
        /documents/ (sub-folder)
            /document-1.pdf (file)
            /document-2.pdf (file)
        /accounts/ (sub-folder)
            /account.txt (file)
        etc...
The following XPath expression works.
//jcr:root/vfs:metaData//*[vfs:attributes/vfs:docType='TAX_DOCS']
If I give wrong value, for example instead of 'TAX_DOCS', 'TAX', it returns no documents as expected which is great. This proves that the metadata is correctly stored as expected and it is used in the filter process correctly.
The problem with this query is that it starts searching from the root folder but I want to search from /MyApp/documents sub-folder only. So I tried this:
//jcr:root/MyApp/documents//vfs:metaData//*[vfs:attributes/vfs:docType='TAX_DOCS']
It returns nothing. Then I tried this too but no success.
//jcr:root/MyApp/documents//*[vfs:metaData/vfs:attributes/vfs:docType='TAX_DOCS']
So what am I doing wrong? Is there anything in the workspace.xml configuration that we need to set or that is missing?
Any help is appreciated.
Thanks, Jack
Drop the double slashes from anything but the last path component and use the @ notation for the attribute value, resulting in:
/jcr:root/MyApp/documents//*[vfs:attributes/@vfs:docType='TAX_DOCS']
The // construct looks for the whole subtree instead of just the immediate children like / does. The JCR specification only requires implementations to support the // construct as the last step of the XPath query.
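If such queries are assembled in code, a small helper can keep the shape consistent: single slashes for the fixed folder path and one // immediately before the final step. The Python sketch below only builds the query string and does not talk to Jackrabbit. The helper name build_subtree_query and the doubling of apostrophes as an escape are assumptions made for this illustration.

```python
def build_subtree_query(folder_path: str, property_name: str, value: str) -> str:
    """Build an XPath query that searches everything below one folder."""
    # Fixed path steps are joined with single slashes.
    steps = [step for step in folder_path.split("/") if step]
    base = "/jcr:root" + "".join(f"/{step}" for step in steps)

    # Assumption: an apostrophe inside a string literal is escaped by doubling it.
    escaped = value.replace("'", "''")

    # The only // sits right before the last step of the query.
    return f"{base}//*[vfs:attributes/@{property_name}='{escaped}']"


print(build_subtree_query("/MyApp/documents", "vfs:docType", "TAX_DOCS"))
```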
