Handling multiple build configurations in parallel - shake-build-system

How can I build one set of source files using two different configurations without having to rebuild everything?
My current setup adds an option --config=rel which will load all options from build_rel.cfg and compile everything to the directory build_rel/.
data Flags = FlagCfg String
    deriving (Show, Eq)
flags = [Option ['c'] ["config"]
            (ReqArg (\x -> Right $ FlagCfg x) "CFG")
            "Specify which configuration to use for the build"]
main :: IO ()
main = shakeArgsWith shakeOptions { shakeChange = ChangeModtimeAndDigest }
                     flags $
    \flags targets -> return $ Just $ do
        let buildDir = "build" ++
                foldr (\a def -> case (a, def) of
                          (FlagCfg cfg, "") -> '_':cfg
                          _                 -> def)
                      "" flags
        -- Settings are read from a config file.
        usingConfigFile $ buildDir ++ ".cfg"
        ...
If I then run
build --config=rel
build --config=dev
I will end up with two builds
build_rel/
build_dev/
However, every time I switch configuration I end up rebuilding everything. I would guess this is because all my oracles have "changed". I would like all oracles to be specific to my two different build directories so that changes will not interfere between builds using different configurations.
I know there is a -m option to specify where the database should be stored but I would rather not have to specify two options that have to sync all the time.
build --config=rel -m build_rel
Is there a way to update the option shakeFiles after the --config option is parsed?
Another idea was to parameterize all my Oracles to include my build configuration but then I noticed that usingConfigFile uses an Oracle and I would have to reimplement that as well. Seems clunky.
Is there some other way to build multiple targets without having to rebuild everything? It seems like such a trivial thing to do but still, I can't figure it out.

There are a few solutions:
Separate databases
If you want the two directories to be entirely unrelated, with nothing shared between them, then changing the database as well makes most sense. There's currently no "good" way to do that (either pass two flags, or pre-parse some of the command line). However, it should be easy enough to add:
shakeArgsOptionsWith
:: ShakeOptions
-> [OptDescr (Either String a)]
-> (ShakeOptions -> [a] -> [String] -> IO (Maybe (ShakeOptions, Rules ())))
-> IO ()
Which would then let you control both settings from a single flag.
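Until something like that is available to you, the "pre-parse some of the command line" route can be sketched in plain Haskell. This is only an illustration: configFromArgs and buildDirFor are hypothetical helper names, and only the --config=NAME spelling is recognised (not -c NAME). As in the question's foldr, the last --config flag wins.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- | Value of the last @--config=NAME@ argument, if there is one.
-- Hypothetical helper: only the long @--config=NAME@ form is recognised.
configFromArgs :: [String] -> Maybe String
configFromArgs args =
  case mapMaybe (stripPrefix "--config=") args of
    [] -> Nothing
    xs -> Just (last xs)

-- | Build directory for a configuration, following the question's naming:
-- "build" when no flag is given, "build_rel" for --config=rel.
buildDirFor :: Maybe String -> FilePath
buildDirFor Nothing    = "build"
buildDirFor (Just cfg) = "build_" ++ cfg
```

In main you would read the arguments yourself with getArgs (from System.Environment) and pass shakeOptions { shakeFiles = buildDirFor (configFromArgs args) } to shakeArgsWith, so the database location follows the same flag as the build directory.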
Single database
If you want a single database, you could load all the config files and specify config keys like release.destination = ... and debug.destination = .... A rule for */output.txt would then look up the config based on the prefix of the file being built, e.g. release/output.txt would look up release.destination. The advantage here is that anything that does not change between debug and release (e.g. documentation) could potentially be shared.
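The prefix-to-key mapping could look roughly like this. It is a minimal sketch: configPrefix and destinationKey are made-up names, and paths are assumed to be relative and to use '/' as the separator.

```haskell
-- | First directory component of a relative path,
-- e.g. "release" for "release/output.txt".
-- Assumes '/' separators; a path with no '/' is returned unchanged.
configPrefix :: FilePath -> String
configPrefix = takeWhile (/= '/')

-- | Config key that holds the destination for the given output file,
-- e.g. "release.destination" for "release/output.txt".
destinationKey :: FilePath -> String
destinationKey out = configPrefix out ++ ".destination"
```

Inside the */output.txt rule you would then pass destinationKey out to getConfig, so each prefix reads its own setting while sharing one database.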

Related

How to build for different environments using shake-build?

Is there a built-in way to pass command-line arguments to a "shakefile"? I'd like to pass --env production|development|staging and then use it within my rules to (slightly) alter the build-steps for each environment.
There are two halves to this problem - first getting the flags into Shake, and secondly, using them to influence the behaviour.
You can get arguments into Shake using any Haskell command line parser, but Shake ships with support for one built in, which can often be easier:
data Flags = Production | Dev | Staging deriving Eq
flags = [Option "" ["production"] (NoArg $ Right Production) "Build production."
        ,Option "" ["dev"] (NoArg $ Right Dev) "Build dev."
        ,Option "" ["staging"] (NoArg $ Right Staging) "Build staging."
        ]
main = shakeArgsWith shakeOptions flags $ \flags targets -> return $ Just $ do
    want targets
    ... do whatever you want with the flags ...
To use the flags to influence the build, you might want to:
1. Completely isolate build outputs from each flag, in which case changing the directory and setting shakeFiles differently in each case makes each one fully distinct.
2. Use the flags to change the output paths, so you always have rules dev/main.js and prod/main.js, and then you consult the flags when doing want to pick up the right rules.
3. Put the flags into an Oracle and have them as tracked settings, so when you flip from prod to dev some things rebuild.
If the builds are 80%+ distinct I'd go for 1. If you change flags very rarely 3 can work. Otherwise, I tend to opt for 2. But they all work, so picking the simplest to start with is also not unreasonable.
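As an illustration of option 2, the mapping from flags to the targets you want can be kept as a small pure function. This is a sketch, not part of Shake: outputDir and targetsFor are hypothetical names, the "staging" directory is an assumption, and falling back to dev when no flag is given is an arbitrary choice.

```haskell
data Flags = Production | Dev | Staging deriving (Eq, Show)

-- | Output directory for each environment.
-- "prod" and "dev" follow the paths mentioned above; "staging" is assumed.
outputDir :: Flags -> FilePath
outputDir Production = "prod"
outputDir Dev        = "dev"
outputDir Staging    = "staging"

-- | Targets to request for the given flags.
-- With no flags we fall back to the dev build (an arbitrary default).
targetsFor :: [Flags] -> [FilePath]
targetsFor [] = [outputDir Dev ++ "/main.js"]
targetsFor fs = [outputDir f ++ "/main.js" | f <- fs]
```

The rules for dev/main.js and prod/main.js stay fixed, and only the call want (targetsFor flags) depends on the command line, so switching environments does not invalidate the other environment's outputs.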

Can a shake rule determine which "needs" have changed since the last build?

I am building a shake based build system for a large Ruby (+ other things) code base, but I am struggling to deal with Ruby commands that expect to be passed a list of files to "build".
Take Rubocop (a linting tool). I can see three options:
need all Ruby files individually; if they change, run rubocop against the individual file that changed for each file that changed (very slow on first build or if many ruby files change because rubocop has a large start up time)
need all Ruby files; if any change, run rubocop against all the ruby files (very slow if only one or two files have changed because rubocop is slow to work out if a file has changed or not)
need all Ruby files; if any change, pass rubocop the list of changed dependencies as detected by Shake
The first two rules are trivial to build in shake, but my problem is I cannot work out how to represent this last case as a shake rule. Can anyone help?
There are two approaches to take with Shake, using batch or needHasChanged. For your situation I'm assuming rubocop just errors out if there are lint violations, so a standard one-at-a-time rule would be:
"*.rb-lint" %> \out -> do
    need [out -<.> "rb"]
    cmd_ "rubocop" (out -<.> "rb")
    writeFile' out ""
Use batch
The function batch describes itself as:
Useful when a command has a high startup cost - e.g. apt-get install foo bar baz is a lot cheaper than three separate calls to apt-get install.
And the code would be roughly:
batch 3 ("*.rb-lint-errors" %>)
    (\out -> do need [out -<.> "rb"]; return out) $
    (\outs -> do cmd_ "rubocop" [out -<.> "rb" | out <- outs]
                 mapM_ (flip writeFile' "") outs)
Use needHasChanged
The function needHasChanged describes itself as:
Like need but returns a list of rebuilt dependencies since the calling rule last built successfully.
So you would write:
"stamp.lint" %> \out -> do
    changed <- needHasChanged listOfAllRubyFiles
    cmd_ "rubocop" changed
    writeFile' out ""
Comparison
The advantage of batch is that it is able to run multiple batches in parallel, and you can set a cap on how much to batch. In contrast needHasChanged is simpler but is very operational. For many problems, both are reasonable solutions. Both these functions are relatively recent additions to Shake, so make sure you are using 0.17.2 or later, to ensure it has all the necessary bug fixes.

How to override Shake configuration on the command-line

I maintain small configuration files per project read via usingConfigFile. I'd like to be able to override any of those settings on the command line. It seems using shakeArgsWith (rather than shakeArgs) is the first step on the way but I don't see an obvious way to wire that through to the values produced by getConfig. Is there a standard approach for doing this?
There isn't a standard approach, but I know several larger build systems have invented something. A combination of shakeArgsWith, readConfigFile and usingConfig should do it. Something like (untested):
main = shakeArgsWith shakeOptions [] $ \_ args -> return $ Just $ do
    file <- liftIO $ readConfigFile "myfile.cfg"
    usingConfig $ Map.union (argsToSettings args) file
    myNormalRules
Where argsToSettings is some function that parses your arguments and turns them into settings - e.g. breaking on the first = symbol or similar.
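A minimal sketch of that parsing step, assuming settings are passed as key=value arguments, is shown here. parseSetting and argsToPairs are hypothetical names; the result is a plain association list so the snippet needs nothing beyond base.

```haskell
import Data.Maybe (mapMaybe)

-- | Split an argument on its first '=' into (key, value).
-- Returns Nothing when there is no '=' or the key would be empty.
parseSetting :: String -> Maybe (String, String)
parseSetting arg =
  case break (== '=') arg of
    (key@(_ : _), '=' : value) -> Just (key, value)
    _                          -> Nothing

-- | Keep only the arguments that look like settings.
-- Anything without an '=' (e.g. a target name) is dropped here.
argsToPairs :: [String] -> [(String, String)]
argsToPairs = mapMaybe parseSetting
```

argsToSettings would then be Map.fromList . argsToPairs, with Map being the same map module used for Map.union above. Because the command-line settings are the first argument to Map.union, they take precedence over values from the config file. Arguments that are not settings are ignored by this sketch, so you would still need to decide whether to treat them as targets.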

Skip folders to build using scons after full build

I have a large number of source files (~10,000) and they are scattered across several folders.
I wanted to know if there is a way to skip certain folders that I know haven't changed.
For example, consider the following folder structure:
A (Sconstruct is here)
|
->B (unchanged 1000 files)
->C (unchanged 1000 files)
->D (changed 1 file)
Once I do a complete build for the first time, I want it to compile everything (B, C, D) but when I modify a file in D (I know that), I would like to build folder D only, skip B and C and finally link them all together to form the final binary (B, C and new D).
I have been looking for quite some time now but not able to figure it out. Is it even possible? Can I specify only to look into a particular folder for changes?
First, I'd investigate using Decider('timestamp-match') or even building a custom Decider function. That should speed up your dependency-checking time.
But to answer your specific question, yes it is possible to not build the targets in B and C. If you don't invoke a builder for the targets in those subdirectories, you just won't build them. Just have an if that selectively chooses which env.Object() (or similar) functions to invoke.
When I fleshed out your example, I chose to have each subdirectory create a library that would be linked into the main executable, and to only invoke env.SConscript() for the directories that the user chooses. Here is one way to implement that:
A/SConstruct:
subdirs = ['B','C','D']
AddOption('--exclude', default=[], action='append', choices=subdirs)
env = Environment(EXCLUDES = GetOption('exclude'))
env.SConscript(
    dirs=[subdir for subdir in subdirs
          if subdir not in env['EXCLUDES']],
    exports='env')
env2 = env.Clone()
env2.PrependUnique(LIBPATH=subdirs,
                   LIBS=subdirs)
env2.Program('main.c')
B/SConscript:
Import('env')
env.Library('B', env.Glob('*.c'))
C/SConscript:
Import('env')
env.Library('C', env.Glob('*.c'))
D/SConscript:
Import('env')
env.Library('D', env.Glob('*.c'))
To do a global build: scons
To do a build after modifying a single file in D: scons --exclude=B --exclude=C
EDIT
Similarly, you can add a whitelist option to your SConstruct. The idea is the same: only invoke builders for certain objects.
Here is a SConstruct similar to above, but with a whitelist option:
subdirs = ['B','C','D']
AddOption('--only', default=[], action='append', choices=subdirs)
env = Environment(ONLY = GetOption('only') or subdirs)
env.SConscript(
    dirs=env['ONLY'],
    exports='env')
env2 = env.Clone()
env2.PrependUnique(LIBPATH=subdirs,
                   LIBS=subdirs)
env2.Program('main.c')
To build everything: scons
To rebuild D and relink main program: scons --only=D
If D is independent of B and C, just specify your target in D (program/library), or the whole directory, explicitly as a target on the command line, like scons D/myprog.exe. SCons will expand the required dependencies automatically, and as such doesn't traverse the unrelated folders B and C.
Note how you can specify an arbitrary number of targets, so
scons D/myprog.exe B
is allowed too.

How should I interpolate environment variables in Shake file patterns?

In my Makefiles, I prefer having the output directory defined by an environment variable rather than hard-coded (with some reasonable default value if it's unset). For example, a Make rule would look like
$(OUTPUT_DIR)/some_file: deps
	#build commands
I have yet to figure out how to achieve a similar goal in Shake. I like using getEnvWithDefault to grab the value of the environment variable or a reasonable default, but no amount of bashing it with binds or lambdas have allowed me to combine it with (*>).
How might it be possible to interpolate an environment variable in a FilePattern for use with (*>)?
The function getEnvWithDefault runs in the Action monad, and the name of the rule has to be supplied in a context where you cannot access the Action monad, so you can't translate this pattern the way you tried. There are a few alternatives:
Option 1: Use lookupEnv before calling shake
To exactly match the behaviour of Make you can write:
main = do
    outputDir <- fromMaybe "output" <$> lookupEnv "OUTPUT_DIR"
    shakeArgs shakeOptions $ do
        (outputDir </> "some_file") *> \out -> do
            need deps
            -- build commands
Here we use the lookupEnv function (from System.Environment) to grab the environment variable before we start running Shake. We can then define a file that precisely matches the environment variable.
Option 2: Don't force the output in the rule
Alternatively, we can define a rule that builds some_file regardless of what directory it is in, and then use the tracked getEnvWithDefault when we say which file we want to build:
main = shakeArgs shakeOptions $ do
    "//some_file" *> \out -> do
        need deps
        -- build commands
    action $ do
        out <- getEnvWithDefault "output" "OUTPUT_DIR"
        need [out </> "some_file"]
Here the rule pattern can build anything, and the caller picks what the output should be. I prefer this variant, but there is a small risk that if the some_file pattern overlaps in some way you might get name clashes. Introducing a unique name, so all outputs are named something like $OUTPUT_DIR/my_outputs/some_file eliminates that risk, but is usually unnecessary.

Resources