How to strip exif from image original - hugo

I follow the shortcode conventions described here: https://laurakalbag.com/processing-responsive-images-with-hugo/
and set the Exif parameters in config.toml like so:
[imaging.exif]
# Regexp matching the fields you want to include from the (massive) set of Exif info
# available. As we cache this info to disk, this is for performance and
# disk space reasons more than anything.
# If you want it all, put ".*" in this config setting.
# Note that if neither this or ExcludeFields is set, Hugo will return a small
# default set.
includeFields = ""
# Regexp matching the Exif fields you want to exclude. This may be easier to use
# than IncludeFields above, depending on what you want.
excludeFields = ".*"
# Hugo extracts the "photo taken" date/time into .Date by default.
# Set this to true to turn it off.
disableDate = true
# Hugo extracts the "photo taken where" (GPS latitude and longitude) into
# .Long and .Lat. Set this to true to turn it off.
disableLatLong = true
However, I noticed that while Hugo correctly strips EXIF from the generated scaled images, it ALSO places the original image, with its EXIF intact, in the public directory, which presents a security issue.
I'm happy with a solution to not publish the original, or a solution that does publish the original image, but with stripped EXIF.
Thanks for any pointers, I'm sure I'm misunderstanding something fundamental!

A little inelegant but very safe: run a preparation step before the actual build (to /public). That way you can automatically remove all EXIF information from all photos (e.g. in the blog directory). I use 'jhead' for this purpose and include the command, along with other preparations, in a script that also runs the build.
If you don't have the build process in your own hands, this won't work, of course.
find ./static/images/ -type f -print0 | xargs -0 -I {} jhead -purejpg {}


How can I configure Git to ignore trivial changes (e.g. timestamp) in auto-generated code?

I am working with a tool which auto-generates a large amount of C code. The tool generates code for a batch of .c and .h files at each run. For some reason, the tool isn't smart enough to recognize when the files have no substantial changes, so in many cases it simply updates a timestamp in the comments at the top of each file. Otherwise, the file remains unaltered.
When I run git status in that scenario, I sometimes see dozens or hundreds of files changed. But as I review the changes to the individual files, most of them have no real changes - just an update to the timestamp. I have to go through each file one-by-one to determine if there are any actual changes to be committed.
Is there a way to configure Git so that it can ignore inconsequential changes such as the timestamp in the header comments? Or how might I otherwise deal with this situation?
Thanks for your help.
Yes; this is the purpose of a filter.
You might already be familiar with Git's notion of "clean" and "smudge" filters; that's how it handles line-ending conversion. When you are on a Windows computer and have Windows-style line endings in your working directory, you might set a .gitattributes entry like * text=auto, indicating that you want files checked into the repository with "normalized" Unix-style line endings. In this case, the files will have the "clean" filter applied to convert \r\n line endings to \n line endings. Similarly, the files will be "smudged" on checkout to convert from \n to \r\n on disk.
You can create your own clean and smudge filters to remove (or add) data when translating between the working directory and the repository. For these files you can add an attribute:
*.c filter=autogen
And then you can configure your autogen filter, with commands to run in the "clean" (into the repository) and "smudge" (into the working directory) directions.
git config --global filter.autogen.clean remove_metadata
git config --global filter.autogen.smudge cat
(Using cat is a "noop" as far as filters are concerned).
The Pro Git book has more detailed examples of creating your own filters.
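As a rough illustration of what the remove_metadata placeholder could look like, here is a minimal sketch of a clean filter in Python. The header format (a comment line starting with "* Generation Time:"), the replacement token and the script itself are assumptions for illustration only; adapt the pattern to whatever your generator actually writes.

```python
#!/usr/bin/env python3
"""Hypothetical clean filter: blanks out the generator's timestamp line."""
import re
import sys

# Assumed header line, e.g. " * Generation Time: Mon Jan 5 11:55:00 2011".
TIMESTAMP_RE = re.compile(r"^([ \t]*\*[ \t]*Generation Time:).*$", re.MULTILINE)


def clean(text: str) -> str:
    """Replace the variable part of every timestamp line with a fixed token."""
    return TIMESTAMP_RE.sub(r"\1 <stripped>", text)


if __name__ == "__main__":
    # Git pipes the working-tree file to stdin and stores whatever is printed.
    sys.stdout.write(clean(sys.stdin.read()))
```

If this were saved as an executable script on your PATH under the name remove_metadata, the filter.autogen.clean setting above would pick it up. Two files that differ only in the timestamp then produce identical blobs, so git status no longer reports them as modified.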
I discovered a way to address the problem of trivial changes using Beyond Compare. I will describe the process as it pertains to ignoring timestamp updates in auto-generated C files, but it can be easily adapted to other situations and languages:
Configure Beyond Compare as the Git difftool. See here for specific details about how to do this.
(Optional but helpful) Add a Git alias for the git difftool --dir-diff --no-symlinks command (for example, dtd).
Make some changes (e.g. auto-generate your files), and run git dtd to do a directory diff. Beyond Compare will open and show you a before/after Folder Comparison of your changes.
Open a Text Compare session window for one of your changed files. Open the Tools menu and select File Formats.
Open the Grammar tab, delete the "Comments" grammar element.
Add a new grammar element and give it a meaningful name such as "Generation Time Comment".
For Category, select the "Delimited" grammar element. In the "Text from" box, enter the text you would like to ignore. For example, if the timestamp in your auto-generated code starts with the string * Generation Time:, enter it into the "Text from" box. Check the "Stop at end of line" checkbox.
Click the "Save" button and go back to your Text Compare session window.
Open the Session menu and select Session Settings. Open the Importance tab.
Look for your new grammar element (e.g. "Generation Time Comment") and uncheck it. This will tell Beyond Compare to treat it as an unimportant change.
Open the Comparison tab, select Rule-Based Comparison.
Change the dropdown at the bottom of the dialog to Update session defaults.
Close Beyond Compare, and then reopen it again by running the git dtd command.
All of the files in the Folder Compare session which contain nothing but an update to the timestamp will be shown with unimportant differences. If you want to completely hide files with unimportant differences, toggle on Ignore Unimportant Differences in the View menu.
Reference: https://www.scootersoftware.com/support.php?zz=kb_unimportantv3

Need script/utility to label MOST, if not all, ClearCase elements for a given path

I found out that labels must be applied starting at the VOB if you want to successfully recreate a specific code (label) release. I thought you wouldn't have to start at the VOB name but you do :-(
My VOB has many programs in it. For example:
VOBname\programs\Java\Program1\files...
VOBname\programs\Java\Program2\files...
VOBname\programs\VB\Program1\files...
VOBname\programs\VB\Program2\files...
What I would like to do is have a script or program that takes two parameters, a path and label, and applies that label to the proper directories and files in that path.
It should not apply the label to other, unrelated directories (i.e., if I am labeling Java\Program1 it should not also label Java\Program2).
I also need the reverse - If someone incorrectly applies the label, then I need to remove the label from the path.
It seems like this feature would have been incorporated into the GUI or a script long ago but I don't see one available. Of course, you can do this manually but this takes longer especially if you have a long path.
I know you can label a directory and all contents underneath that directory but if you start at the VOB, that would label everything (what I don't want).
The simplest solution is to:
apply the label recursively, starting from the path (mklabel needs a pathname argument, here the current directory):
cd /path
cleartool mklabel -replace -recurse LABEL .
for a given path, extract the parent folders, and label those:
avob/
avob/aParentFolder
avob/aParentFolder/aParentSubFolder
Depending on your scripting language, extracting the parent folders can be as easy as using Perl's File::Basename:
use File::Basename;
# On Unix, fileparse("/foo/bar/baz") returns ("baz", "/foo/bar/", "")
my ($filename, $directories, $suffix) = fileparse($path);
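The two steps can also be sketched in Python. This is only an illustration under assumptions: the helper names are made up, paths are treated as Unix-style and relative to the directory containing the VOB, and the functions build the cleartool argument lists without running them, so you can review them first.

```python
from pathlib import PurePosixPath


def parent_folders(path: str) -> list[str]:
    """Return every ancestor of `path`, outermost first, excluding `path` itself."""
    parents = [str(p) for p in PurePosixPath(path).parents]
    # Drop the "." or "/" root entry; it is not a folder inside the VOB.
    return [p for p in reversed(parents) if p not in (".", "/")]


def label_commands(path: str, label: str) -> list[list[str]]:
    """Build (but do not run) the cleartool calls for both steps."""
    # Step 2 of the answer: label each parent folder on its own (no -recurse).
    cmds = [["cleartool", "mklabel", "-replace", label, parent]
            for parent in parent_folders(path)]
    # Step 1 of the answer: label the target path and everything below it.
    cmds.append(["cleartool", "mklabel", "-replace", "-recurse", label, path])
    return cmds
```

Each returned list can be passed to subprocess.run once the output looks right. Because the parents are labeled without -recurse, sibling programs such as Java/Program2 are left untouched.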

Is there a quick file open/find like IntelliJ's find file, or Sublime's? Something with fuzzy search. But in Emacs?

I'm looking for something that's a bit robust in how it finds files in Emacs. I have a project made up of a number of different kinds of files, and a lot of them. So I think Emacs would need to cache a lookup or something like that for a quick find/open facility to work. It would also need to be configurable per project to consider only some directories and exclude others inside the project, since a number of files and directories are generated and hold a massive amount of text, and sometimes a concatenated representation of the rest of the code.
Is there a quick file open/find like IntelliJ's find file, or Sublime's? Something with fuzzy search. But in Emacs? That could help with this problem?
Projectile can probably do what you're after. It describes itself as a "project interaction library" with facilities for finding project files quickly.
Try projectile: https://github.com/bbatsov/projectile (see its fancy UI, helm-projectile). You'll have the command projectile-find-file. It is based on projects (they are defined by a .git/.hg/… or a .projectile).
Permanent caching? Yes.
Filter out directories? Yes (with a command or a config entry in the .projectile file).
Fuzzy search? Yes, a few: Emacs's default, ido, ido-fuzzy, grizzl or helm.
You install it simply with M-x package-install RET projectile RET.
See this EmacsWiki page, which is a jumping-off place for multiple answers to your question.
Emacs has a built-in file-name cache -- see (emacs) File Name Cache and this page.
See also Emacs bookmarks, and in particular, Bookmark+. You can bookmark any file or set of files. You can bookmark a Dired buffer, including its omit set, markings, and included subdirs. You can bookmark a set of such Dired buffers. You can aggregate bookmarks and use them to perform actions that set up environments etc. They can be triggered in various ways. You can bookmark Emacs desktops. You can tag bookmarks and files & dirs with free-form tags, which lets you organize them flexibly into overlapping sets.
See also this page about project support with Icicles.

How can git be configured to ignore files?

There are some files we want ignored, not tracked, by git, and we are having trouble figuring out how to do that.
We have a third-party C library which is unpacked and checked into Git. But when you run configure && make on it, it produces many new files. How do we write a .gitignore that tracks the source files and not the new stuff? (It's not like forbidding *.o.)
Edit: There are at least 12 file-types. So we would like NOT to enumerate, which type we want and which not.
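Use ! to re-include the types of files you need, something like the following example. The !*/ line is needed so that Git still descends into subdirectories; a file cannot be re-included if its parent directory is excluded.
*
!*/
!*.c
!*.h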
Explicitly specifying which files should be tracked and ignoring all others might be a solution. * says ignore everything and subsequent lines specify files and directories which should not be ignored. Wildcards are allowed.
*
!filename
!*.extension
!directory/
!/file_in_root_directory
!/directory_in_root_directory
Remember that the order matters. Putting * at the end makes all previous lines ineffective.
Take a look at the gitignore(5) man page and search for !. It says:
Patterns have the following format:
(...)
An optional prefix ! which negates the pattern; any matching file excluded by a previous pattern will become included again. If a negated pattern matches, this will override lower precedence patterns sources.
I'm not sure why you say "it's not like forbidding *.o", but I think you mean that there aren't any good patterns you can identify that apply to the generated files but not to the source files? If it's just a few things that appear (like individual built executables that often don't have any extension on Linux), you can name them explicitly in .gitignore, so they aren't a problem.
If there really are lots and lots of files that get generated by the build process that share extensions and other patterns with the source files, then just use patterns that do include your source files. You can even put * in .gitignore if it's really that bad. This will mean that no new files show up when you type git status, or get added when you use git add ., but it doesn't harm any files that are already added to the repository; git will still tell you about changes to them fine, and pick them up when you use git add .. It just puts a bit more burden on you to explicitly start tracking files that you do care about.
I would make sure the repo is clean (no changes, no untracked files), run configure && make and then put the newly untracked files into the ignore file. Something like git status --porcelain | fgrep '??' | cut -c4- will pull them out automatically, but it would be worth some eyeball time to make sure that is correct...
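The same extraction can be sketched in Python, which makes it easier to review the list before writing anything to .gitignore. The function names are illustrative only, and the sketch assumes plain file names; git status --porcelain quotes paths that contain unusual characters, which this does not undo.

```python
import subprocess


def untracked_paths(porcelain_output: str) -> list[str]:
    """Pick the '??' (untracked) entries out of `git status --porcelain` text."""
    # Each line is a two-character status, a space, then the path (like cut -c4-).
    return [line[3:] for line in porcelain_output.splitlines()
            if line.startswith("?? ")]


def untracked_after_build(repo_dir: str = ".") -> list[str]:
    """Run git in `repo_dir` and return the candidates for .gitignore."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return untracked_paths(result.stdout)
```

Calling untracked_after_build() right after configure && make in a previously clean checkout gives the list to eyeball; untracked directories show up with a trailing slash.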

Removing Keyword Substitution comments from source files?

Note: For want of a better word I call the fluff at the start of source files --
/* #(#) $Id: file.c,v 1.9 2011/01/05 11:55:00 user Exp $
**************************************************************************
* COPYRIGHT, 2005-2011 *
...
*/
-- Keyword Substitution comments, although I do not know if this is just a subversion term.
Anyway, now to the question: We have a 3rd party supplier that we get source code from. These C sources all have these keyword substitution comments, and every time we get a new version from the supplier, all (1000+) files are changed because they update these comments for every release they send us, even if no source code changes whatsoever are made in these files, so the only change is the comments. Now, before we compile and use these sources, we would be interested in doing a cursory code review to see the areas that have been changed. (Never trust the release history.) However, this is rather difficult, as doing a simple folder diff will obviously list all files.
What I'm looking for now is whether there already exist any simple tools to strip these special multi line comments from the source files. Maybe anyone has a link to a grep or sed script that will scratch that stuff from the files?
Something like:
perl -ne '$c = 1 if m+/\*.*\$Id: +; print unless $c; $c = 0 if $c && m+\*/+;'
Note that this will work only if
such comments are delimited with /*...*/
on the first line there is $Id:
there is nothing after the */
there is no */ before the /*
And that it will strip all lines that are between start of comment and end of comment.
I have not tested it!
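The same idea, sketched in Python under the same assumptions (the comment is delimited with /*...*/ and its first line contains $Id:). The function names are made up for illustration, and comparing two releases this way is a suggestion rather than part of the answer above.

```python
import re

# Assumption: the header is a /* ... */ block whose first line contains "$Id:".
KEYWORD_COMMENT_RE = re.compile(r"/\*[^\n]*\$Id:.*?\*/[ \t]*\n?", re.DOTALL)


def strip_keyword_comments(source: str) -> str:
    """Remove every comment block whose opening line carries an $Id: keyword."""
    return KEYWORD_COMMENT_RE.sub("", source)


def same_ignoring_keywords(old: str, new: str) -> bool:
    """True when two versions of a file differ only in their keyword comments."""
    return strip_keyword_comments(old) == strip_keyword_comments(new)
```

Running same_ignoring_keywords over each pair of files from the old and new delivery narrows the folder diff down to the files with real changes. Ordinary comments without $Id: on their first line are left alone.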
First, I would try to convince them to review either their version control system (looks as if they use RCS, still?) or if that is not possible to have them hook up to a svn or git server for submitting their changes. But perhaps you already did?
If nothing in that sense is possible, I would try to set up a git repository to hold the versions that they supply to you. Git allows you to have filters when you are importing or exporting and also has support for ignoring such tags for deltas between versions.
