I am writing an Apache module and I have run into some confusion regarding the behavior of the directory configuration merge function.
In the source for mod_example, the parameters are labelled like this:
static void *x_merge_dir_config(apr_pool_t *p, void *parent_conf, void *newloc_conf);
But given directives like this:
<Location /path/>
MyOption value-from-path
</Location>
<Location />
MyOption value-from-root
</Location>
When this function is called as a result of an access to http://localhost/path/, it is called with parent_conf coming from /path/ and newloc_conf coming from /, which is exactly the opposite of what I would expect based on the names of these parameters. I would describe "/" as the parent and "/path/" as the child/subordinate/most specific path.
I'm trying to figure out what the real story is here. Is Apache using the word "parent" differently than I do? Has mod_example erroneously misnamed these parameters? Am I simply confused?
As I understand it, the "parent"/"new" arguments have more to do with the order in which Apache applies configuration directives than with the pathnames associated with those directives. As you know, it starts with the per-server configuration, then virtual hosts, then Location sections, etc., with later directives in this list overriding earlier ones. The job of the merge callback is to compute the result of overriding the stuff in "parent" with the stuff in "newloc".
If multiple Location sections match a request, they are applied in the order they appear in the config file. Therefore, when MyOption value-from-root is merged in, it is the "new" piece of configuration (it is being applied after value-from-path), so it ends up in the "newloc" argument, and the configuration computed up to that point, including value-from-path, is in "parent".
(As an aside, unless your module has an unusual merge behavior, this means you probably want to swap the order of the two Location sections. The second one, for root, will always apply, and will presumably prevent the first one from ever having a visible effect.)
Let's consider the following use case:
a set of providers pushes data in a corresponding directory on a local server (e.g. P1 pushes data into data/P1, P2 into data/P2, etc.)
each provider has its own generation rules (e.g. P1 generates plain txt files, P2 generates archives, P3 generates encrypted files, etc.)
on a Spring Boot application running on the server, each provider has its own Camel route which, every 10 minutes, reads from the corresponding directory (e.g. R1 reads from("file:data/P1"), R2 reads from("file:data/P2"), etc.)
a given provider can also combine rules (e.g. P4 generates archives containing encrypted data)
depending on the route, read data is then processed accordingly in order to move plain txt files to a target directory (e.g. R2 unzips data and moves it, R4 unzips data, decrypts the extraction result and moves it, etc.)
As more routes are implemented, it quickly becomes obvious that most of the code is duplicated and can be extracted; in fact, since rules can be combined, each data transformation could be seen as an atomic operation available to the given route.
Let's consider, for instance, the following atomic operations:
unzip
decrypt
move
So, here's what those routes could look like:
R1
from("file:data/P1")
.to("file:destination")
R2
from("file:data/P2")
// UNZIP LOGIC HERE
.to("file:destination")
R3
from("file:data/P3")
// DECRYPT LOGIC HERE
.to("file:destination")
R4
from("file:data/P4")
// UNZIP LOGIC HERE
// DECRYPT LOGIC HERE
.to("file:destination")
Since I want to extract the common logic, I see two main options here (with the corresponding resulting code for R4):
extract the logic into a custom component
from("file:data/P4")
// FOR EACH FILE
.to("my-custom-component:unzip")
.to("my-custom-component:decrypt")
.to("file:destination")
extract the logic into smaller routes
from("file:data/P4")
// FOR EACH FILE
.to("direct:my-unzip-route")
.to("direct:my-decrypt-route")
.to("file:destination")
(of course, this is a super simplification, but it's just to give you the big picture).
Between those two options, I prefer the latter, which allows me to quickly reuse Camel EIPs (e.g. unmarshal().pgp()):
from("file:data/P4")
.to("direct:my-unzip-route")
.to("direct:my-decrypt-route")
.to("file:destination");
from("direct:my-unzip-route")
// LOGIC
.unmarshal().zip()
// MORE LOGIC
;
from("direct:my-decrypt-route")
// LOGIC
.unmarshal().pgp()
// MORE LOGIC
;
First question: since the given sub-route changes the original set of files (e.g. unzip could transform one archive into 100 files), would it be better to use enrich() instead of to()?
Second question: what am I supposed to route between those sub-routes? Right now, due to implementation details not explained here for simplicity, I'm routing a collection of file names. So, instead of the from("file:data/P4") I have a from("direct:read-P4"), which receives a list of file names as input and then propagates this list to the given sub-routes; each sub-route, starting from the list of file names, applies its own logic, generates new files and returns a body with the updated list (e.g. it receives {"test.zip"} and returns {"file1.txt", "file2.txt"}).
So, the given sub-route looks like this:
from("direct:...")
// FOR EACH FILE NAME
// APPLY TRANSFORMATION LOGIC TO THE CORRESPONDING FILE
.setBody( // UPDATED LIST OF FILE NAMES )
Is it correct to end a route without a producer EIP? Or am I supposed to end it with the to() which generates the given new file? If so, the next sub-route would have to read all the data again from the very same directory, which doesn't seem optimal, since I already know which files have to be taken into consideration.
Third question: supposing it's ok to let the given sub-route transform data and return the corresponding list of names, how am I supposed to test it? It wouldn't be possible to mock the ending producer, since I'm not completing the route with a producer... so, do I have to use interceptors? Or what else?
Basically, I'm asking this question because I have a perfectly working set of routes but, while writing tests, I've noticed that some of them are unnatural to test... and this could easily be the result of a wrong design.
enrich is used if you want or need to integrate external data into your route.
The example below illustrates how you could load the actual list of file types from an external file; the aggregation strategy adds the file types to an exchange header. Afterwards, a split is performed on the enriched header and each exchange is directed to the respective route (or an error is thrown if no route is available for the file type). This route makes use of the content enricher EIP (pollEnrich, since the list is consumed from a file endpoint), the split EIP and content-based routing.
from(...)
    .pollEnrich("file:loadFileList", aggregationStrategy) // consume the list from the file endpoint; a plain enrich() would invoke a producer
    .split(header("fileList").tokenize(","))
        .to("direct:routeContent");

from("direct:routeContent")
    .choice()
        .when(body().isEqualTo("...")) // after the split, each file type token becomes the message body
            .to("direct:storeFile")
        .when(body().isEqualTo("..."))
            .to("direct:unzip")
        .when(body().isEqualTo("..."))
            ...
        .otherwise()
            .throwException(...)
    .end();

from("direct:storeFile")
    .to("file:destination");

from("direct:unzip")
    .split(new ZipSplitter())
        .streaming().convertBodyTo(String.class)
        .choice()
            .when(body().isNotNull())
                .to("file:destination")
            .otherwise()
                .throwException(...)
        .end()
    .end();
The corresponding aggregation strategy might look something like this:
import org.apache.camel.Exchange;
import org.apache.camel.processor.aggregate.AggregationStrategy;

public class FileListAggregationStrategy implements AggregationStrategy {

    @Override
    public Exchange aggregate(Exchange original, Exchange resource) {
        // store the list read from the resource endpoint as a header on the original exchange
        String content = resource.getIn().getBody(String.class);
        original.getIn().setHeader("fileList", content);
        return original;
    }
}
Note, however, that I have not tested the code myself; I just wrote down a basic structure mostly off the top of my head.
Is it correct to end a route without a producer EIP
AFAIK Camel should "tell you" (in the sense of an error) that no producer is available when you attempt to load the route. A simple .log(LoggingLevel.DEBUG, "...") is enough to stop Camel complaining, if I remember correctly (I haven't worked with Camel in a while now).
You always have the possibility to weave certain routes using AdviceWithRouteBuilder and modify them to your needs. The interceptor likewise acts on certain route definitions, such as .to(...). The easiest way to test routes is to use MockEndpoints: replace the final producer calls (e.g. .to("file:destination")) with your mock endpoint and perform your assertions on it.
In the sample above you could, for example, test only the final unzipping route by providing the ProducerTemplate with the respective ZIP archive as body and replacing the .to("file:destination") with a mock endpoint you perform assertions against. You could also send a single ZIP archive to direct:routeContent and pass along a fileList header with the respective name, so that the routing invokes the unzip route. Here you could either reuse the modified direct:unzip route definition, and thus also check whether the archive can be unzipped, or replace the .to("direct:unzip") route invocation with another mock endpoint, and so on.
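To make that concrete, here is a minimal, untested sketch of such a test. It assumes Camel 2.x with camel-test, that the unzip route above was given .routeId("unzip-route"), that the routes under test are supplied via createRouteBuilder(), and that a test.zip archive is on the test classpath; none of these names come from the original code.

import org.apache.camel.builder.AdviceWithRouteBuilder;
import org.apache.camel.component.mock.MockEndpoint;
import org.apache.camel.test.junit4.CamelTestSupport;
import org.junit.Test;

public class UnzipRouteTest extends CamelTestSupport {

    @Override
    public boolean isUseAdviceWith() {
        return true; // the context is started manually after the route has been advised
    }

    @Test
    public void shouldUnzipArchiveIntoDestination() throws Exception {
        context.getRouteDefinition("unzip-route").adviceWith(context, new AdviceWithRouteBuilder() {
            @Override
            public void configure() throws Exception {
                // swap the real file producer for a mock endpoint we can assert against
                weaveByToUri("file:destination").replace().to("mock:destination");
            }
        });
        context.start();

        MockEndpoint destination = getMockEndpoint("mock:destination");
        destination.expectedMinimumMessageCount(1);

        // feed a sample archive into the route under test
        template.sendBody("direct:unzip", getClass().getResourceAsStream("/test.zip"));

        destination.assertIsSatisfied();
    }
}

The same pattern should also cover the sub-routes from the question that end with setBody(...): advise the route, weave a mock onto its end (e.g. with weaveAddLast().to("mock:result")), and assert on the body the sub-route produces.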
Is it possible to access environment variables in the config spec in ClearCase?
I have this code:
element /folder/... /main/current_branch/LATEST
I wish to set my development up so that I can update the branch by simply setting some environment variables. I would like something like this to work; is it possible?
element /folder/... /main/$current_branch/LATEST
where $current_branch should return the current branch set in that environment variable.
AFAIK, that is not possible.
The way I handle that is by having templates that I fill in (automatically). But I also use separate views; views are disposable and I rebuild my views routinely (every week, every couple of weeks, sometimes a few times in a day if I need to be sure of the cleanliness of the builds).
I'd show you my scripts but there are a large number of them, and they're fairly intricately intertwined with each other and with the working environment we have (multiple but overlapping VOBs for each of a number of major versions of a number of products, with some parts of the config spec provided by CM and custom preambles to identify what I'm working on). We've been using ClearCase for about 18 years now.
The net result is a config spec for a bug fix branch that looks like:
# #(#)$Id:243260.jleffler.toru.cs,v 1.1 2011/08/30 15:23:02 jleffler Exp $
#
# Config Spec for Bug 243260 - Blah, blah, blah, blah
element * CHECKEDOUT
element * .../TEMP.243260.jleffler/LATEST
mkbranch -override TEMP.243260.jleffler
#time 26-Jul-2009.00:00:00UTC-08:00
element /vobs/main_vob/... /main/LATEST
element /vobs/other_vob/... dist.1.00 -nocheckout
include /atria/cspecs/product/1.23/product-1.23.4
#include /atria/cspecs/product/1.16/product-1.16.8
element * /main/LATEST
The bit between the commented out time stamp and the catch-all rule is provided by CM. The bit above the time stamp is custom to the branch (TEMP.243260.jleffler — which identifies it as a temporary branch, the bug fix which it is for, and who is doing the work). The template actually lists about 10 different config specs from CM, and I just delete the ones that aren't relevant. The view name is based on the bug number, my login, and the machine where it's created (toru). I've disguised most of the rest, but it is based on a bug cspec that I created earlier today. My bug.view script took the bug number, a description, the path for the view working storage, and the VOBs where I needed the branch created and went off and set everything up automatically. (And I'm still archaic enough to use RCS to keep my cspecs under control.)
Some of my views last a long time (by name). For example, the current release reference view will survive for the 5+ years that the release will be supported. It'll be rebuilt hundreds of times over that period, but the name remains the same: prod-1.23-ref.jleffler.toru. So the cspec for that will change over time, as different work is needed, but the basic cspec is three lines — CHECKEDOUT, include standard CM provided configuration file, and LATEST.
No, I have never seen a config spec based on an environment variable.
I looked at the config_spec man page, "writing config specs" and "How config specs work": none of them refers to that possibility.
For dynamic views, I have seen scripts modifying the config spec dynamically, based on an environment variable, using cleartool setcs (since the refresh is near instantaneous with a dynamic view).
Note: don't forget that your current_branch might not always derive directly from /main. I prefer using the syntax:
element /folder/... .../my_branch/LATEST
in order to select my_branch, without depending on its direct "parent" branch (even though, in base ClearCase, there is no real "parent" branch).
The DirectComponent documentation gives the following example:
from("activemq:queue:order.in")
.to("bean:orderServer?method=validate")
.to("direct:processOrder");
from("direct:processOrder")
.to("bean:orderService?method=process")
.to("activemq:queue:order.out");
Is there any difference between that and the following?
from("activemq:queue:order.in")
.to("bean:orderServer?method=validate")
.to("bean:orderService?method=process")
.to("activemq:queue:order.out");
I've tried to find documentation on what the behaviour of the to() method is on the Java DSL, but beyond the RouteDefinition javadoc (which gives the very curt "Sends the exchange to the given endpoint") I've come up blank :(
In the very case above, you will not notice much difference. The "direct" component is much like a method call.
Once you start building slightly more complex routes, you will want to segment them into several different parts for multiple reasons.
You can, for instance, create "sub routes" that can be reused among multiple routes in your Camel context, much like you factor out methods in regular programming to enable reuse and make the code clearer. The same goes for sub-routes using, for instance, the direct component.
The same approach can be extended. Say you want multiple protocols to be used as endpoints to your route. You can use the direct endpoint to create the main route, something like this:
// Three endpoints to one "main" route.
from("activemq:queue:order.in")
.to("direct:processOrder");
from("file:some/file/path")
.to("direct:processOrder");
from("jetty:http://0.0.0.0/order/in")
.to("direct:processOrder");
from("direct:processOrder")
.to("bean:orderService?method=process")
.to("activemq:queue:order.out");
Another thing is that one route is created for each "from()" clause in the DSL. A route is an artifact in Camel, and you can perform certain administrative tasks on it with the Camel API, such as starting, stopping, adding and removing routes dynamically. The "to" clause is just an endpoint call.
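For instance, a hedged sketch of such an administrative task, assuming Camel 2.x and that the processOrder route above was given .routeId("processOrder") (neither detail is stated in the example), could look like this:

import org.apache.camel.CamelContext;

public final class RouteAdmin {

    // Pause and resume a single route by its id. "processOrder" is a hypothetical
    // route id assigned with .routeId("processOrder") on the from("direct:processOrder") route.
    public static void restartProcessOrderRoute(CamelContext context) throws Exception {
        context.stopRoute("processOrder");   // only this route stops; the other routes keep running
        // ... perform maintenance, swap resources, etc. ...
        context.startRoute("processOrder");  // resume consuming from the route's from() endpoint
    }
}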
Once you start doing real cases of some complexity with Camel, you will find that you can hardly have too many "direct" routes.
The Direct component is used to name a logical segment of the route. This is similar to naming procedures in structured programming.
In your example there is no difference in message flow. In terms of structured programming, we could say that you perform a kind of inline expansion of your route.
Another difference is that the Direct component doesn't have any thread pool; the direct consumer's process method is invoked by the calling thread of the direct producer.
Mainly it is used to break up a complex route configuration, just as in Java we use methods for reusability. Also, by configuring threads in the direct route we can reduce the work done by the calling thread.
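A minimal sketch of that idea, reusing the endpoint names from the earlier example (the threads(5) hand-off and the pool size are my own illustration, not something stated above):

import org.apache.camel.builder.RouteBuilder;

public class DirectThreadingRoutes extends RouteBuilder {
    @Override
    public void configure() {
        from("activemq:queue:order.in")
            .to("direct:processOrder");          // invoked synchronously on the JMS consumer's thread

        from("direct:processOrder")
            .threads(5)                          // hand further processing to a pool, freeing the calling thread
            .to("bean:orderService?method=process")
            .to("activemq:queue:order.out");
    }
}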
from(A).to(B).to(OUT)
is chaining
A --- B --- OUT
But
from(A).to(X)
from(B).to(X)
from(X).to(OUT)
where X is a direct: endpoint,
is basically like a join
A
\____ OUT
/
B
Obviously these are different behaviours, and with the second you could implement any logic you wanted, not just a serial chain.
I'm developing an Apache-based application with a few custom modules.
I'd like to share some functionality in one module with others.
I need to wire them together during the startup phase.
I want to use GetModuleHandle + GetProcAddress (it will run under Windows only) with a module name, but this will succeed only if the module is already loaded by the Apache server.
Is there a way to configure the loading order of Apache modules?
I only need to control my modules - others are irrelevant.
Thanks in advance.
If you're trying to control the Apache hook calling order from the source of your module, you can try using APR_HOOK_FIRST, APR_HOOK_MIDDLE, and APR_HOOK_LAST. Or you can specifically name other modules to enforce ordering constraints. From the docs:
... "There are two mechanisms for doing this. The first, rather crude, method, allows us to specify roughly where the hook is run relative to other modules. The final argument control this. There are three possible values: APR_HOOK_FIRST, APR_HOOK_MIDDLE and APR_HOOK_LAST.
"All modules using any particular value may be run in any order relative to each other, but, of course, all modules using APR_HOOK_FIRST will be run before APR_HOOK_MIDDLE which are before APR_HOOK_LAST. Modules that don't care when they are run should use APR_HOOK_MIDDLE. These values are spaced out, so that positions like APR_HOOK_FIRST-2 are possible to hook slightly earlier than other functions. ...
"The other method allows finer control. When a module knows that it must be run before (or after) some other modules, it can specify them by name. The second (third) argument is a NULL-terminated array of strings consisting of the names of modules that must be run before (after) the current module. For example, suppose we want "mod_xyz.c" and "mod_abc.c" to run before we do, then we'd hook as follows ..." [example follows]
What is the difference between
function mythemes_preprocess_html(&$variables) { ... }
and
function mythemes_process_html(&$variables) { ... }
in Drupal 7's template.php?
When must I use preprocess functions and when must I use process functions?
Thanks.
They're effectively the same thing albeit called in different phases. Preprocess functions are called first and changes are made. Process functions are then called at a later phase and allow for changes to be made to alter any modifications introduced during the preprocess phase.
See http://drupal.org/node/223430 for more information.
More exactly, from Drupal API documentation:
If the implementation is a template file, several functions are called before the template file is invoked, to modify the $variables array. These fall into the "preprocessing" phase and the "processing" phase, and are executed (if they exist), in the following order (note that in the following list, HOOK indicates the theme hook name, MODULE indicates a module name, THEME indicates a theme name, and ENGINE indicates a theme engine name): (source: http://api.drupal.org/api/drupal/includes!theme.inc/function/theme/7)
And if you follow the link above, it will list, in order, the entire theme() progression, from preprocess functions to process functions to the template file itself.
Which stage of the process do you want to affect? There are two options:
Preprocess function: runs first.
Process function: runs after all the preprocess functions have been executed.