I have the following GitHub Actions workflow, which downloads an image.
If the file already exists, I need to skip the "Commit files" and "Push changes" steps.
How can I check whether the file already exists, so that nothing is done when it does?
on:
  workflow_dispatch:
name: Scrape File
jobs:
  build:
    name: Build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        name: Check out current commit
      - name: Url
        run: |
          URL=$(node ./action.js)
          echo $URL
          echo "URL=$URL" >> $GITHUB_ENV
      - uses: suisei-cn/actions-download-file@v1
        id: downloadfile
        name: Download the file
        with:
          url: ${{ env.URL }}
          target: assets/
      - run: ls -l 'assets/'
      - name: Commit files
        run: |
          git config --local user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git config --local user.name "github-actions[bot]"
          git add .
          git commit -m "Add changes" -a
      - name: Push changes
        uses: ad-m/github-push-action@master
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          branch: ${{ github.ref }}
There are a few options here - you can go directly with bash and do something like this:
if test -f "$FILE"; then
#file exists
fi
or use one of the existing actions like this:
- name: Check file existence
  id: check_files
  uses: andstor/file-existence-action@v1
  with:
    files: "assets/${{ env.URL }}"
- name: File exists
  if: steps.check_files.outputs.files_exists == 'true'
  run: echo "It exists !"
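For the plain bash route, the test could be turned into a step output that later steps check in their if: condition. This is only a minimal sketch: the helper name file_exists_output and the example path are hypothetical, and it assumes the target path is known before the download step runs.

```shell
# Hypothetical helper: prints a key=value line for the given path.
# In a workflow step the line would be appended to "$GITHUB_OUTPUT".
file_exists_output() {
  if test -f "$1"; then
    echo "exists=true"
  else
    echo "exists=false"
  fi
}

# Example use inside a step (the path is illustrative only):
#   file_exists_output "assets/image.png" >> "$GITHUB_OUTPUT"
```

The "Commit files" and "Push changes" steps could then carry a condition that compares that output with 'true', in the same way as the action-based example above.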
WARNING: Linux (and maybe MacOS) only solution ahead!
I was dealing with a very similar situation some time ago and developed a method that not only checks for added files but is also useful if you want to check for modified or deleted files or directories.
Warning:
This solution works only if the file is added/modified/deleted in a git repository.
Introduction:
The command git status --short returns a list of untracked, added, deleted and modified files. For example:
 D deleted_foo
 M modified_foo
?? untracked_dir_foo/
?? untracked_file_foo
A  tracked_n_added_foo
Note that git status -s is the short form of the same command.
Understanding `git status -s` output:
When you read the output, you will see some lines in this form:
** filename
** dirname/
Note that here ** represent the first word of the line (ones like D, ?? etc.).
Here is a summary of all ** in the lines:
**    | Meaning (the quotes only mark where the spaces are)
" D"  | File/dir has been deleted.
" M"  | File/dir has been modified.
"??"  | File/dir has been added but not tracked using git add [FILENAME].
"A "  | File/dir has been added and also tracked using git add [FILENAME].
NOTE: Take care of the spaces! Using, for example, "M " instead of " M" in the following solution will not work as expected!
Solution:
Shell part of solution:
We can grep the output of git status -s to check whether a file/dir was added/modified/deleted.
The shell part of the solution goes like this:
if git status -s | grep -x "** [FILENAME]"; then
# Do whatever you wanna on match
else
# Do whatever you wanna on no-match
fi
Note: Get desired ** from the table above and replace [FILENAME] with filename.
For example, to check whether a file named foo was modified, use:
git status -s | grep -x " M foo"
Explanation: We use git status -s to get the output and pipe the output to grep. We also use command line option -x with grep so as to match whole line.
Workflow part of solution:
A very simple solution will go like this:
...
- name: Check for file
  id: file_check
  run: |
    if git status -s | grep -x "** [FILENAME]"; then
      echo "check_result=true" >> $GITHUB_OUTPUT
    else
      echo "check_result=false" >> $GITHUB_OUTPUT
    fi
...
- name: Run dependent step
  if: steps.file_check.outputs.check_result == 'true'
  run: |
    # Do whatever you wanna do on file found to be
    # added/modified/deleted, based on what you set '**' to
...
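One possible refinement of the grep step, offered as a sketch rather than as part of the original solution: wrapping the match in a small function and adding -F makes grep treat the file name as a fixed string, so a dot in the name is not read as a regex wildcard. The function name status_has_entry is hypothetical.

```shell
# Hypothetical helper: reads `git status -s` output on stdin and succeeds
# only if it contains exactly the line "<code> <path>".
status_has_entry() {
  code=$1
  path=$2
  # -x: match the whole line, -F: fixed string (no regex), -q: no output
  grep -qxF -- "$code $path"
}

# Example: git status -s | status_has_entry ' M' foo
```

This assumes plain file names. Git quotes paths containing spaces or unusual characters in the short format, so such names would need extra handling.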
I want to move the file index.js from the root of the project to the dist/project_name. This is the step from cloudbuild.yaml:
- name: 'gcr.io/cloud-builders/docker'
  entrypoint: /bin/bash
  args: ['-c', 'mv', 'index.js', 'dist/project_name']
But the step is failing with the following error:
Already have image (with digest): gcr.io/cloud-builders/docker
mv: missing file operand
Try 'mv --help' for more information.
How can I fix this issue?
Because you're using bash -c, I think you need to encapsulate the entire "script" in a string:
args: ['-c', 'mv index.js dist/project_name']
My personal preference (and it's just that), is to not embed JSON ([...]) in YAML. This makes the result in this case slightly clearer and makes it easier to embed a multiline script:
args:
  - -c
  - |
    mv index.js dist/project_name
NOTE tools like YAMLlint will do this for you too.
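The reason the original args failed can be seen locally: with bash -c, only the first argument is the script, and any further arguments become the positional parameters $0, $1 and so on. That leaves mv running with no operands. A small sketch, with a hypothetical function name:

```shell
# With `bash -c`, the first argument is the script text; the remaining
# arguments are assigned to $0, $1, ... inside that script.
show_positional() {
  bash -c 'echo "0=$0 1=$1"' index.js dist/project_name
}

show_positional   # prints: 0=index.js 1=dist/project_name
```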
[Updated1] I have a shell script that changes TCP kernel parameters in some functions, but now I need to run this script in a Docker container. That means the script needs to know it is running inside a container and stop configuring the kernel.
I'm not sure how to achieve that. Here are the contents of /proc/self/cgroup inside the container:
9:hugetlb:/
8:perf_event:/
7:blkio:/
6:freezer:/
5:devices:/
4:memory:/
3:cpuacct:/
2:cpu:/docker/25ef774c390558ad8c4e9a8590b6a1956231aae404d6a7aba4dde320ff569b8b
1:cpuset:/
Are there any flags above that I can use to figure out whether this process is running inside a container?
[Updated2]: I have also looked at "Determining if a process runs inside lxc/Docker", but it does not seem to work in this case. The content of /proc/1/cgroup in my container is:
8:perf_event:/
7:blkio:/
6:freezer:/
5:devices:/
4:memory:/
3:cpuacct:/
2:cpu:/docker/25ef774c390558ad8c4e9a8590b6a1956231aae404d6a7aba4dde320ff569b8b
1:cpuset:/
No /lxc/containerid
Docker creates .dockerenv and .dockerinit (removed in v1.11) files at the top of the container's directory tree so you might want to check if those exist.
Something like this should work.
#!/bin/bash
if [ -f /.dockerenv ]; then
echo "I'm inside matrix ;(";
else
echo "I'm living in real world!";
fi
Whether you are inside a Docker container can be checked via /proc/1/cgroup. As this post suggests, you can do the following:
Outside a Docker container, all entries in /proc/1/cgroup end with /, as you can see here:
vagrant@ubuntu-13:~$ cat /proc/1/cgroup
11:name=systemd:/
10:hugetlb:/
9:perf_event:/
8:blkio:/
7:freezer:/
6:devices:/
5:memory:/
4:cpuacct:/
3:cpu:/
2:cpuset:/
Inside a Docker container some of the control groups will belong to Docker (or LXC):
vagrant@ubuntu-13:~$ docker run busybox cat /proc/1/cgroup
11:name=systemd:/
10:hugetlb:/
9:perf_event:/
8:blkio:/
7:freezer:/
6:devices:/docker/3601745b3bd54d9780436faa5f0e4f72bb46231663bb99a6bb892764917832c2
5:memory:/
4:cpuacct:/
3:cpu:/docker/3601745b3bd54d9780436faa5f0e4f72bb46231663bb99a6bb892764917832c2
2:cpuset:/
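The comparison above could be scripted roughly as follows. This is a sketch under assumptions: the function name cgroup_in_container is hypothetical, and the pattern covers only the /docker/ and /lxc/ path prefixes shown in these listings.

```shell
# Hypothetical helper: succeeds if any cgroup path in the given file
# (default: /proc/1/cgroup) starts with /docker/ or /lxc/.
cgroup_in_container() {
  file=${1:-/proc/1/cgroup}
  grep -qE ':/(docker|lxc)/' "$file" 2>/dev/null
}

# Example: cgroup_in_container && echo "in a container"
```

A failed match is best treated as inconclusive rather than as proof of running on the host, since other layouts (for example systemd-managed or Kubernetes cgroup paths) would not match this pattern.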
We use the process's sched file (/proc/$PID/sched) to extract the PID of the process. The process's PID inside the container will differ from its PID on the host (a non-container system).
For example, /proc/1/sched in a container will return:
root@33044d65037c:~# cat /proc/1/sched | head -n 1
bash (5276, #threads: 1)
While on a non-container host:
$ cat /proc/1/sched | head -n 1
init (1, #threads: 1)
This helps to differentiate if you are in a container or not. eg you can do:
if [[ ! $(cat /proc/1/sched | head -n 1 | grep init) ]]; then {
echo in docker
} else {
echo not in docker
} fi
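The check above greps for the name init, which assumes PID 1 on the host is called init. A variant that compares the number in parentheses instead might be less dependent on the init system's name. This is a sketch only, and sched_pid is a hypothetical name:

```shell
# Hypothetical helper: prints the PID shown in parentheses on the first
# line of a sched file (default: /proc/1/sched).
sched_pid() {
  file=${1:-/proc/1/sched}
  head -n 1 "$file" | sed -n 's/^.*(\([0-9][0-9]*\),.*$/\1/p'
}

# Example: [ "$(sched_pid)" = "1" ] || echo "probably in a container"
```

As with the original, this relies on the kernel exposing the host PID in that file, which may not hold on every system.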
Using Environment Variables
For my money, I prefer to set an environment variable inside the docker image that can then be detected by the application.
For example, this is the start of a demo Dockerfile config:
FROM node:12.20.1 as base
ENV DOCKER_RUNNING=true
RUN yarn install --production
RUN yarn build
The second line sets an envar called DOCKER_RUNNING that is then easy to detect. The issue with this is that in a multi-stage build, you will have to repeat the ENV line every time you FROM off of an external image. For example, you can see that I FROM off of node:12.20.1, which includes a lot of extra stuff (git, for example). Later on in my Dockerfile I then COPY things over to a new image based on node:12.20.1-slim, which is much smaller:
FROM node:12.20.1-slim as server
ENV DOCKER_RUNNING=true
EXPOSE 3000
COPY --from=base /build /build
CMD ["node", "server.js"]
Even though this image target server is in the same Dockerfile, it requires the ENV var to be defined again because it has a different base image.
If you make use of Docker-Compose, you could instead easily define an envar there. For example, your docker-compose.yml file could look like this:
version: "3.8"
services:
  nodeserver:
    image: michaeloryl/stackdemo
    environment:
      - NODE_ENV=production
      - DOCKER_RUNNING=true
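On the detecting side, a shell script could test the variable like this. A minimal sketch, assuming the variable is set to the literal string true as in the examples above; the function name is hypothetical.

```shell
# Hypothetical helper: succeeds only when DOCKER_RUNNING is exactly "true".
running_in_docker_env() {
  [ "${DOCKER_RUNNING:-}" = "true" ]
}

# Example: running_in_docker_env || echo "not in the Docker image"
```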
Thomas' solution as code:
running_in_docker() {
(awk -F/ '$2 == "docker"' /proc/self/cgroup | read non_empty_input)
}
Note
The read with a dummy variable is a simple idiom for Does this produce any output?. It's a compact method for turning a possibly verbose grep or awk into a test of a pattern.
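The idiom can be tried in isolation. A small sketch with a hypothetical function name:

```shell
# Hypothetical helper: succeeds if stdin delivers at least one complete
# (newline-terminated) line, and fails on empty input.
has_output() {
  read -r first_line
}

# Example: some_command | has_output && echo "produced output"
```

One caveat, as far as I know: read returns a non-zero status when the input ends without a trailing newline, so output that lacks a final newline would count as "no output" here.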
What works for me is to check the inode number of '/.'.
Inside Docker, it's a very high number.
Outside Docker, it's a very low number like '2'.
I reckon this approach also depends on the filesystem being used.
Example
Inside the docker:
# ls -ali / | sed '2!d' |awk {'print $1'}
1565265
Outside the docker
$ ls -ali / | sed '2!d' |awk {'print $1'}
2
In a script:
#!/bin/bash
# Inode number of "." in the root directory (second line of the listing).
INODE_NUM=$(ls -ali / | sed '2!d' | awk '{print $1}')
if [ "$INODE_NUM" = '2' ]; then
    echo "Outside the docker"
else
    echo "Inside the docker"
fi
We needed to exclude processes running in containers, but instead of checking for just docker cgroups we decided to compare /proc/<pid>/ns/pid to the init system at /proc/1/ns/pid. Example:
pid=$(ps ax | grep "[r]edis-server \*:6379" | awk '{print $1}')
# Quote both substitutions so an unreadable link gives an empty string, not a syntax error.
if [ "$(readlink "/proc/$pid/ns/pid")" = "$(readlink /proc/1/ns/pid)" ]; then
    echo "pid $pid is in the same namespace as the init system"
else
    echo "pid $pid is in a different namespace from the init system"
fi
Or in our case we wanted a one liner that generates an error if the process is NOT in a container
bash -c "test -h /proc/4129/ns/pid && test $(readlink /proc/4129/ns/pid) != $(readlink /proc/1/ns/pid)"
which we can execute from another process and if the exit code is zero then the specified PID is running in a different namespace.
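The same comparison could be wrapped in a function that separates "different namespace" from "could not read the link", which the one-liner does not distinguish. This is a sketch: the name same_namespace and the return code 2 are choices made here, not part of the original.

```shell
# Hypothetical helper: compares the targets of two namespace links.
# Returns 0 if they match, 1 if they differ, 2 if either cannot be read.
same_namespace() {
  a=$(readlink "$1") || return 2
  b=$(readlink "$2") || return 2
  [ "$a" = "$b" ]
}

# Example: same_namespace "/proc/$pid/ns/pid" /proc/1/ns/pid
```

Reading /proc/1/ns/pid usually requires sufficient privileges, so a return code of 2 may simply mean the caller is not allowed to look.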
Go code that reads /proc/<pid>/cgroup to check whether a process is running in a Docker container, including in a Kubernetes cluster:
import (
	"fmt"
	"io/ioutil"
	"strings"
)

// containerImage maps container IDs to image names. It is assumed to be
// populated elsewhere; the original snippet does not show its declaration.
var containerImage = map[string]string{}

// GetContainerID returns the container ID for the given PID, or "" if none is found.
func GetContainerID(pid int32) string {
	cgroupPath := fmt.Sprintf("/proc/%d/cgroup", pid)
	return getContainerID(cgroupPath)
}

// GetImage returns the image recorded for a container ID, or "" if unknown.
func GetImage(containerId string) string {
	if containerId == "" {
		return ""
	}
	image, ok := containerImage[containerId]
	if ok {
		return image
	}
	return ""
}

func getContainerID(cgroupPath string) string {
	containerID := ""
	content, err := ioutil.ReadFile(cgroupPath)
	if err != nil {
		return containerID
	}
	lines := strings.Split(string(content), "\n")
	for _, line := range lines {
		field := strings.Split(line, ":")
		if len(field) < 3 {
			continue
		}
		cgroup_path := field[2]
		if len(cgroup_path) < 64 {
			continue
		}
		// Non-systemd Docker
		//5:net_prio,net_cls:/docker/de630f22746b9c06c412858f26ca286c6cdfed086d3b302998aa403d9dcedc42
		//3:net_cls:/kubepods/burstable/pod5f399c1a-f9fc-11e8-bf65-246e9659ebfc/9170559b8aadd07d99978d9460cf8d1c71552f3c64fefc7e9906ab3fb7e18f69
		pos := strings.LastIndex(cgroup_path, "/")
		if pos > 0 {
			id_len := len(cgroup_path) - pos - 1
			if id_len == 64 {
				// docker id
				containerID = cgroup_path[pos+1 : pos+1+64]
				return containerID
			}
		}
		// systemd Docker
		//5:net_cls:/system.slice/docker-afd862d2ed48ef5dc0ce8f1863e4475894e331098c9a512789233ca9ca06fc62.scope
		docker_str := "docker-"
		pos = strings.Index(cgroup_path, docker_str)
		if pos > 0 {
			pos_scope := strings.Index(cgroup_path, ".scope")
			id_len := pos_scope - pos - len(docker_str)
			if pos_scope > 0 && id_len == 64 {
				containerID = cgroup_path[pos+len(docker_str) : pos+len(docker_str)+64]
				return containerID
			}
		}
	}
	return containerID
}
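For a quick check from a shell, a rough equivalent might look like the sketch below. It is looser than the Go code: it accepts any run of 64 lowercase hex characters instead of checking the position of the ID in the path. The function name is hypothetical, and it relies on grep supporting -o, which GNU and BSD grep do.

```shell
# Hypothetical helper: prints the first 64-character lowercase hex string
# found in a cgroup file (default: /proc/self/cgroup), or nothing.
container_id_from_cgroup() {
  file=${1:-/proc/self/cgroup}
  grep -oE '[0-9a-f]{64}' "$file" 2>/dev/null | head -n 1
}

# Example: id=$(container_id_from_cgroup "/proc/$pid/cgroup")
```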
Based on Dan Walsh's comment about using SELinux ps -eZ | grep container_t, but without requiring ps to be installed:
$ podman run --rm fedora:31 cat /proc/1/attr/current
system_u:system_r:container_t:s0:c56,c299
$ podman run --rm alpine cat /proc/1/attr/current
system_u:system_r:container_t:s0:c558,c813
$ docker run --rm fedora:31 cat /proc/1/attr/current
system_u:system_r:container_t:s0:c8,c583
$ cat /proc/1/attr/current
system_u:system_r:init_t:s0
This just tells you you're running in a container, but not which runtime.
I didn't check other container runtimes, but https://opensource.com/article/18/2/understanding-selinux-labels-container-runtimes provides more info and suggests this is widely used, so it might also work for rkt and lxc.
What works for me, as long as I know which system the programs/scripts will be running on, is confirming whether what's running with PID 1 is systemd (or equivalent). If not, that's a container.
And this should be true for any linux container, not only docker.
I needed this capability in 2022 on macOS, and of all the options only the answer by @at0S still works.
/proc/1/cgroup only has the root directory in a container unless configured otherwise
/proc/1/sched showed the same in-container process number. The name was different (bash) but that's not very portable.
Environment variables work if you configure your container yourself, but none of the default environment variables helped
I did find an option not listed in the other answers: /proc/1/mounts included an overlay filesystem with "docker" in its path.
My cloudbuild.yaml consists of the following:
- name: 'gcr.io/cloud-builders/gsutil'
  args: ['-m', 'cp', '-r', '/workspace/api-testing/target/cucumber-html-reports', 'gs://testing-reports/$BUILD_ID']
- name: 'gcr.io/cloud-builders/gsutil'
  args: ['-m', 'rm', '-r', 'gs://studio360-testing-reports/latest']
- name: 'gcr.io/cloud-builders/gsutil'
  args: ['-m', 'cp', '-r', '/workspace/api-testing/target/cucumber-html-reports', 'gs://testing-reports/latest']
This way I always have my latest report separated from the older ones. But can I pass a {date} arg or something similar into my first step so that I have a visual order of all the older reports?
(Because there is no way to rank the files by last modified in the gcp storage/bucket)
Thanks
Change the first block to this:
- name: 'gcr.io/cloud-builders/gsutil'
  args: ['-m', 'cp', '-r', '/workspace/api-testing/target/cucumber-html-reports', 'gs://testing-reports/${_DATE}_$BUILD_ID']
Then run this:
gcloud builds submit . --substitutions _DATE=$(date +%F_%H:%M:%S)
Then you would have something like this in the bucket:
gs://testing-reports/2020-02-13_14:01:40_8a6a7ed0-62e0-43ed-8f97-aa6eca9c2834
Explanation here and here.
EDIT:
For automatic builds started by Cloud Build triggers, use this cloudbuild.yaml:
steps:
  - name: 'gcr.io/cloud-builders/gsutil'
    entrypoint: 'bash'
    args:
      - '-c'
      - |
        gsutil -m cp -r $FILENAME gs://$BUCKET/$FILENAME-$(date +%F_%H:%M:%S)-$BUILD_ID
This allows the builder to use bash to execute gsutil, so the bash command "date" can be used inside the command.
Good explanation of the syntax by Googler here, and info about entrypoint here.
Pretty sure you should be able to bash out and do something like this:
- name: 'gcr.io/cloud-builders/gsutil'
  entrypoint: 'bash'
  args:
    - -c
    - |
      gsutil -m cp -r /workspace/api-testing/target/cucumber-html-reports gs://testing-reports/$BUILD_ID-$(date +%m-%d-%Y)
To my knowledge, you can't run system commands in sub. variables or env. variables. (or at least I haven't been able to figure out how)
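Since the goal in the question is a visual order in the bucket, the position and format of the date matter. A name that starts with a year-first timestamp sorts chronologically when listed by name, whereas a month-first format such as %m-%d-%Y would not sort correctly across years. A sketch of how such a name could be built inside the bash step; the function name report_name is hypothetical.

```shell
# Hypothetical helper: builds "<timestamp>_<build id>".
# $1 = build id, $2 = optional timestamp (defaults to the current time).
report_name() {
  build_id=$1
  stamp=${2:-$(date +%F_%H:%M:%S)}
  printf '%s_%s\n' "$stamp" "$build_id"
}

# Example: gsutil -m cp -r reports "gs://testing-reports/$(report_name "$BUILD_ID")"
```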
I have two tasks in a task group:
1) a db task to bring up a db and
2) the app that needs the db to be up.
Both start in parallel and the db task takes a little while, but by then the app recognizes that the db is not up and kills the db task. Any solutions? Please advise.
It's somewhat common to have an entrypoint script that checks if the db is healthy. Here's a script I've used before:
#!/bin/sh
set -e

# Succeeds once Postgres accepts connections (or immediately if NO_DB is set).
postgres_ready() {
    if test -z "${NO_DB}"
    then
        PGPASSWORD="${RDS_PASSWORD}" psql -h "${RDS_HOSTNAME}" -U "${RDS_USERNAME}" -d "${RDS_DB_NAME}" -c '\l'
        return $?
    else
        echo "NO_DB Postgres will pretend to be up"
        return 0
    fi
}

until postgres_ready
do
    >&2 echo "Postgres is unavailable - sleeping"
    sleep 1
done

>&2 echo "Postgres is up - continuing..."
# Run the original arguments as the command; "$@" keeps each argument separate.
exec "$@"
You could save it as entrypoint.sh and run it with your application start script as the argument. eg: entrypoint.sh python main.py
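The loop in that script waits forever if the database never comes up. A bounded variant is sketched below, under assumptions: the function name wait_until and the WAIT_INTERVAL variable are hypothetical, and the readiness check is passed in as a command.

```shell
# Hypothetical helper: runs a command until it succeeds, trying at most
# $1 times and sleeping WAIT_INTERVAL seconds (default 1) between tries.
wait_until() {
  max=$1
  shift
  i=0
  until "$@"; do
    i=$((i + 1))
    if [ "$i" -ge "$max" ]; then
      return 1   # gave up after $max failed attempts
    fi
    sleep "${WAIT_INTERVAL:-1}"
  done
}

# Example: wait_until 30 postgres_ready || exit 1
```

Exiting with an error after a fixed number of attempts lets the scheduler restart the app task instead of leaving it stuck.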