Wagtail 4.x - how to run code on save draft/publish/moderate (not clean)?

Wagtail 4.x question:
On some page models, I have some calculated fields. I run some time/processor intensive methods to return and store the results based on parameters in the page set by the editor on save (publish, draft or moderate).
On previous versions of Wagtail, I'd kick these off from the clean() method of page or form.
With 4.x, the preview is constantly loading, even when hidden, so it's tripping the clean method continuously.
My idea was to move this off to an after_create_page / after_edit_page hook with the following format:
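def get_calculated_fields(request, page):
    if page.specific_class == SomePage:
        try:
            page.somefield = page.get_somefield_value()
            if page.has_unpublished_changes:
                page.save_revision()  # draft: keep the value in a new revision
            else:
                page.save()  # published: store the value on the live page
        except Exception as e:
            print(e)  # placeholder: replace with real logging or error handling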
This seems to work as I want, it only kicks in when one of the actions I want is triggered, and only after the page form has passed the clean(). Save draft saves as revision, publish saves published page.
My concern is if that page save/save_revision logic is sound though or if there are circumstances that this would be a bad idea.
I haven't found anything in the docs about dealing with this. Does anyone have experience with this in 4.x who can confirm this is the way to do it, or is there a better way of handling it?
Edit: There is the logic in Wagtail docs regarding after_create_page to save the revision then publish if live, but this won't work for existing pages. Following this code, if you save draft on an existing live page, it would publish the draft immediately.


React app hosted by Netlify doesn't update unless F5 or reload

I'm a little surprised there is nothing out there about this that I have found. But just like the title says, I have a React SPA deployed to Netlify. It goes live without error. The only issue is, if the end user has been to the site before, they have to refresh the page to see any changes I have pushed out.
Is there something I need to add to the index file perhaps?
The browser caches the compiled js bundle.
You can read more here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control
One of your options would be to disable it, or set cache expiration to a lower value during the intense development and increase it if/when you deploy less often.
Another option could be to implement some kind of API method to check if a newer version has been deployed and trigger a page refresh. (Please be careful not to discard users' work, like data filled in forms, during a refresh.)
Each has its pros and cons.
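A minimal sketch of the "check for a newer version" option, written in TypeScript. It assumes the build publishes a small file with a build id on every deploy; the file name /version.json, the VersionInfo shape and the shouldReload function are all made up for illustration and are not something Netlify provides out of the box.

```typescript
// Hypothetical shape of a tiny file published with every deploy, e.g. /version.json.
interface VersionInfo {
  buildId: string;
}

// Decide whether the tab should reload, given the build it was loaded from,
// the latest deployed build, and whether the user has unsaved input.
function shouldReload(
  loadedBuildId: string,
  latest: VersionInfo,
  hasUnsavedWork: boolean
): boolean {
  if (latest.buildId === loadedBuildId) {
    return false; // nothing new has been deployed
  }
  // A newer build exists, but never reload over unsaved form data.
  return !hasUnsavedWork;
}
```

In the app you would fetch the file every so often with something like fetch("/version.json", { cache: "no-store" }) so the check itself is not served from cache, and call window.location.reload() when shouldReload returns true.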

Cache issues: React + REST server behind CDN

I am looking for a pattern that would allow me to better the UX for my users. I have a REST server running behind CloudFront being consumed from a plain React application on the frontend.
I'll simplify my example to illustrate my issue.
I have an endpoint called GET /posts/<id>. When the browser asks for it, the response comes with max-age=180, which means it gets stored in the browser's cache and any subsequent call to GET /posts/<id> will be served from the browser's cache for the duration of those 180 seconds, after which it will hit the CDN again to try and obtain a fresh copy.
That is okay for most users. I don't mind if updates to any post take up to 3 minutes before they're propagated to all the users. But there is one user who's the author of this post. That user can make changes to this post using PATCH /posts/<id>. Let's call that user The Editor.
Here's a scenario I have right now:
The Editor loads up the post page which then calls GET /posts/5
The CDN serves the latest copy to the front end.
The Editor then makes a change to the post and submits it to the back end via PATCH /posts/5.
The editor then refreshes his browser tab using Command-R (or CTRL-R).
As a result, the front end then requests GET /posts/5 again -- but gets the stale copy from before the changes, because fewer than 180 seconds have passed between the last GET and the GET issued after the PATCH.
What I'd like the experience to be is:
The Editor loads up the post page which then calls GET /posts/5
The CDN serves the latest copy to the front end.
The Editor then makes a change to the post and submits it to the back end via PATCH /posts/5.
After a Command-R browser tab refresh the GET /posts/5 brings back a copy of the data with the changes the editor made with PATCH right away, regardless of the 180 seconds of ttl before a fresh copy can be obtained.
As for the rest of the users, it's perfectly okay for them to wait up to 180 seconds before the change in the post propagates to them when they call GET /posts/5.
I am using Axios, but I do know that SWR and React-Query support mutations. To my understanding this would allow the editor to declare a mutation for the object he just PATCH'ed on the server, so that any subsequent calls he makes to GET /posts/5 will be served from there, until a fresher version can be obtained from the backend.
My questions are:
Can SWR with "mutations" serve the mutated object via the GET /posts/5 transparently?
Will the mutation survive a hard browser tab refresh? Or a browser closure, re-opening and subsequent GET /posts/5?
Is there another pattern/best practice to solve that?
TL;DR: Just append a harmless, gibberish querystring to the end of the request GET /posts/<id>?version=whatever
Good question. I must admit I don't know the full answer to this problem, but I want to share one well-known technique among frontend devs.
The technique is called cache busting. I'm not sure if this is the best practice, but I'm pretty sure it's widely practiced, since it's so straight-forward to understand.
The idea is simple. When you add a changed querystring to the end, you effectively change the URL, so no cache entry matches and you evade the whole cache problem.
So the detailed steps to a solution for your particular use case would go like this:
Normally you'll just request GET /posts/<id> for all users
When a user logs in, a hash key is generated from whatever algorithm. For simplicity let's just use increasing integer and call it version. You store this version in localStorage so it can survive through page refresh.
Now you need to distinguish between the user viewing his own posts and viewing others' posts. When the user is viewing his own, you always use GET /posts/<id>?version=n
Whenever the user edits his post and hits save button, you bump version from n to n+1
Next time he goes to post view page, the app requests GET /posts/<id>?version=n+1 which is not cached, and would retrieve the up-to-date content.
One last thing, make sure your server safely ignores that ?version=n querystring.
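The steps above can be sketched roughly like this in TypeScript. The storage key and the function names are made up for illustration; KeyValueStore is just the part of localStorage the sketch needs, so window.localStorage can be passed in directly in the browser.

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const VERSION_KEY = "postsVersion"; // hypothetical storage key

// Read the stored version, falling back to 0 when nothing valid is stored.
function currentVersion(store: KeyValueStore): number {
  const raw = store.getItem(VERSION_KEY);
  const parsed = raw === null ? NaN : parseInt(raw, 10);
  return isNaN(parsed) ? 0 : parsed;
}

// Call this after a successful PATCH so the next GET uses a new URL.
function bumpVersion(store: KeyValueStore): number {
  const next = currentVersion(store) + 1;
  store.setItem(VERSION_KEY, String(next));
  return next;
}

// Only the author's own posts get the cache-busting querystring.
function postUrl(id: number, isOwnPost: boolean, store: KeyValueStore): string {
  const base = `/posts/${id}`;
  return isOwnPost ? `${base}?version=${currentVersion(store)}` : base;
}
```

Other users keep requesting the plain /posts/<id> URL, so they still benefit from the 180 second cache.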
I'm sure there're other solutions to this problem. I'm no expert of server config and HTTP headers so I'm not getting into that topic, but there must be something to look for.
As for a pure frontend solution, there's the Service Worker API for you to consider. The main point of this API is to enable devs to programmatically control cache strategies.
With this API, you could leave your current app code as-is and just install a service worker. Then you could use the same cache busting technique in the background to fetch new content, or just delete the cache (using the Cache API) when the user edits, or even fake a response for GET /posts/<id> from the PATCH /posts/<id> the user just sent.
Depending on what CDN you use, you can invalidate a cache manually when publishing updates to a post. For example, CloudFront lets you specify which paths you want fetched fresh on the next request.
For sites with lots of traffic but few updates this works pretty well, and is quite simple to implement. For sites with a lot of authors and frequently changing content you would need to get more creative though.
One strategy I've used in the past is using a technique called object versioning, where instead of invalidating the cache to an object you just publish a version of it with a timestamp. This would also mean you need to publish a manifest file when your frontend loads. The manifest contains the latest timestamps of all the content the page needs to load, and is on a much shorter TTL than the rest of the content. When you publish a new version of a post you would update the timestamp in the manifest, and the frontend pulls the latest version of it the next time the page loads.
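A rough TypeScript sketch of the manifest idea. The Manifest shape, the ?v= querystring and both function names are assumptions for illustration; the timestamp could just as well be part of the path, depending on how the versions are published.

```typescript
// Hypothetical manifest shape: content path -> timestamp of its latest published version.
interface Manifest {
  [path: string]: number;
}

// Build the URL for a piece of content, pinned to the timestamp in the manifest.
// Paths missing from the manifest fall back to the plain, unversioned URL.
function versionedUrl(manifest: Manifest, path: string): string {
  if (!Object.prototype.hasOwnProperty.call(manifest, path)) {
    return path;
  }
  return `${path}?v=${manifest[path]}`;
}

// Called when a post is published: record the new timestamp for its path.
function recordPublish(manifest: Manifest, path: string, publishedAt: number): void {
  manifest[path] = publishedAt;
}
```

The manifest itself is the only thing that needs a short TTL. Each versioned URL can be cached for a long time because its content never changes.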

SPA DTM analytics pagename issue

I am using an Angular SPA with DTM. Using custom event-based rules, I am able to get all my data, including pageName, v41 and v42, as correct. Inside the Adobe editor, I am storing the page name in s.pageName and a hard-coded value in s.server. I have verified with the Omnibug tool that server, pageName, v41 and v42 are all populated correctly.
The problem is in Omniture reporting, where the server and page data are not coming through. The page name report only shows the SPA homepage for all page visits, and server shows the default from s.code and not the value I am passing in s.server. The eVars/props are all coming through fine. Even if I set prop40=s.pageName and prop41=s.server, I see the correct data in prop40/prop41 in Omniture reporting, but not under Page and Server. And I can't use prop40/prop41 for pagename/server, as that is not the correct approach and PAGE-VISITS are ZERO in that case.
Any help how to get data in page/server in omniture for SPA or anything wrong in my implementation? Thanks in advance!!
If you really do see the correct values in Omnibug (or more specifically, network request to Adobe collection server), then the issue is not in the code.
Check against another AA hit debugger. Possibly Omnibug is somehow bugging out. There are a ton of alternatives out there. Adobe Experience Cloud Debugger. Observepoint. Charles Proxy. Fiddler. Or just use the browser dev tool network tab (what I usually do as a backup).
Make sure you are looking in the correct report suite. Perhaps your data is being sent to a dev report suite, and you are looking at the prod report suite, or vice versa?
Check to see if you have any Processing Rules that are overriding your values.
Contact your Adobe Rep to check if there are any VISTA Rules present for the report suite, that are overriding your values.
If you have verified none of the above is the case, then sorry, but it sounds like the issue must really be in your code, and there is a problem with your QA method (e.g. maybe you are looking at the wrong AA request, or something).
Based on your comment:
Earlier, i was making s.tl() call, but replacing it with s.t() call
resolved my problem for data was not populating
pageName/server/page-views in Omniture and now it is. But the current
problem is we need PageName on all SPA clicks (can be achieved by
s.t() call ) , but the page-Views are not needed on all clicks. So,
its like link-tracking needed only but with PageName data. I am
struggling not to populate page-views on a s.t() call or vice-versa
how to get PageName populated on s.tl() call. Again, omnibug shows all
requests just fine but the issue comes in reports in omniture
When Adobe processes a hit, it wipes pageName for s.tl calls, as that's how it determines whether to count the request as a page view or not. If you want to see page name even for s.tl calls, the common practice is to dupe the pageName value to a prop or eVar and send in with the s.tl call, and look at that report. In fact, most clients I work with don't even use the native pages report, and instead use the (usually eVar) report.
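As a sketch of that duplication, in TypeScript: the Tracker interface below is a hand-written typing of only the parts of the AppMeasurement s object used here, and trackLinkWithPageName is a made-up helper name. prop40 is taken from your own example; an eVar would work the same way.

```typescript
// Minimal, hand-written typing of the parts of the AppMeasurement `s` object used here.
interface Tracker {
  pageName: string;
  linkTrackVars: string;
  prop40?: string;
  tl(linkObject: true | object, linkType: "o" | "d" | "e", linkName: string): void;
}

// Send a link-tracking hit that still carries the page name, via a prop.
function trackLinkWithPageName(s: Tracker, linkName: string): void {
  // Copy the page name into a prop, since pageName itself is dropped for s.tl hits.
  s.prop40 = s.pageName;
  // s.tl only sends variables listed in linkTrackVars. This overwrites any
  // existing list; a real implementation would probably append instead.
  s.linkTrackVars = "prop40";
  // "o" marks the hit as a custom link, so no page view is counted.
  s.tl(true, "o", linkName);
}
```

You would then read the page name from the prop40 report for these clicks rather than from the native Pages report.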

Synchronous Call outs from Standard Pagelayout In Salesforce

There is a requirement to make a synchronous callout from standard page layouts in Salesforce, e.g. the standard Case layout. To my knowledge, we cannot make a synchronous callout in the context of a standard page layout, since Process Builder, workflow or a JS button will all work in an asynchronous context. I want to confirm that, so please share your views.
Erm... no?
Process Builder, Workflow Rule, Trigger - they all fire on event (insert, update). They're all synchronous (the whole operation waits for them to finish saving). Just viewing a record doesn't count as an event. I think you're confusing the "what triggers the action" and "what kind of action happens". Try to distinguish the two. It's like workflows - a workflow rule (condition to be met) is one thing and then what happens (field update, new task, email, outbound message) is completely different thing.
From "clicks not code" functionality Outbound Messages and emails are asynchronous - but as I said above - just viewing a record doesn't count as update, nothing will fire. Callouts done from apex (and you have several options discussed below) can be either synchronous or async. So it's whatever you want really.
You might look into:
a Visualforce page which you'll embed on the page layout (either directly or as a Chatter action for example). That page's apex controller could do the callout.
Or maybe you'd be able to do it in VF from JavaScript (without apex) - make a REST call to your url, process results... There are tons of code samples how to integrate Google Maps for example, this wouldn't be too different.
or (bit of future thinking) make a Lightning Component. You'll be able to embed it in the Lightning App Builder (the "better" page layouts) directly. In Classic SF you'd have to make it a Chatter Action or use "Lightning Out" to reuse a component in a VF page and embed that instead.
Bit more advanced (and paid extra) would be to use Salesforce Connect a.k.a. Lightning Connect. Try working through the Trailhead on it, that's easier than explaining it here. You could end up with a related list of external records under the case, and the user wouldn't even realise this data is not hosted in SF. Connect requires an OData compatible data source or a piece of Apex that acts as an adapter and pretends to return a valid format.
If it's about Cases - maybe you can do something fancy using the Service Cloud Console. You'd have to experiment a bit but maybe one of side panels could be a VF page or <iframe> to another system. Maybe you could make something happen with Macros...

How does React store and react to changing state exactly?

I am scraping a website that uses React for the front-end. So far it seems that I have to use their search form in order to arrive at the results page.
The problem is that the site clears out the search form's selected options from a dropdown (its state) every time the page is refreshed and therefore it makes scraping significantly slower. I know that it's working as intended, but if there was a way I could directly manipulate the state then it could speed up my scraper as opposed to re-selecting all the choices from the little buttons.
I don't think it uses any type of persistent storage or local storage at all, for every selection, otherwise the form probably wouldn't be cleared on refresh.
I can see that the years options for the form are always present in a data-attribute (data-years=["2017", "2016", ...]) but only for the years. And when a year (or any option from the dropdowns) is selected, a hidden field is populated with a value such as <input type="hidden" name="year" value="2017">.
Is this all that React uses for temporary storage (aka. state)—hidden fields?
And for the second part of my question, what type of event is fired off when there is a state change? How could I trigger it manually? When I select a year, for example, I want the form to give me the options for the next dropdown—given the year.
React does not use the DOM at all to maintain state. The example you provided is simply a poorly written React app. Normally everything will be kept in memory (inside closures, so nothing in window/global) and React will update the DOM as she wants. :)
This means I don't think you'll be able to read/detect React's intrinsic state changes from the outside. Interactive scraping should work like a user using the page, without any hint of what tech it's really using.
Depending on the technology you're using for scraping, you could indeed simulate or generate the real DOM events. When we need to write some end to end tests for a React app using the ubiquitous Selenium server, we normally have to manually click on buttons, options and so on and allow time for the React app to react accordingly and do its magic (like fetching more data and updating the page) and afterwards read document contents to verify everything was working. It's basically "scraping" with a desired output to verify, your test assertions.
If you're scraping static pages only (curl style: fetch the HTML and work your way with the original HTML response), I don't think you'll be able to handle a Javascript form. You need your scraper to be interactive.
Something like PhantomJS apart from the mentioned Selenium/WebDriver may help.
