Two way sync with rsync - file

I have a folder a/ and a remote folder A/.
I now run something like this on a Makefile:
get-music:
rsync -avzru server:/media/10001/music/ /media/Incoming/music/
put-music:
rsync -avzru /media/Incoming/music/ server:/media/10001/music/
sync-music: get-music put-music
when I make sync-music, it first gets all the diffs from server to local and then the opposite, sending all the diffs from local to server.
This works very well only if there are just updates or new files on the future. If there are deletions, it doesn't do anything.
In rsync there is --delete and --delete-after options to help accomplish what I want but thing is, it doesn't work on a 2-way-sync.
If I want to delete server files on a syn, when local files have been deleted, it works, but if, for some reason (explained after) I have some files that aren't in the server but exist locally and they were deleted, I want locally to remove them and not server copied (as it happens).
Thing is I have 3 machines in context:
desktop
notebook
home-server
So, sometimes, server will have files that were deleted with a notebook sync, for example and then, when I run a sync with my desktop (where the deleted server files still exist on) I want these files to be deleted and not to be copied again to the server.
I guess this is only possible with a database and track of operations :P
Any simpler solutions?
Thank you.

Try Unison: http://www.cis.upenn.edu/~bcpierce/unison/
Syntax:
unison dirA/ dirB/
Unison asks what to do when files are different, but you can automate the process by using the following which accepts default (nonconflicting) options:
unison -auto dirA/ dirB/
unison -batch dirA/ dirB/ asks no questions at all, and writes to output how many files were ignored (because they conflicted).
Note: I am no longer using Unison (I use NextCloud, which doesn't address the original use case). However, note that rsync is not designed for bidirectional sync, while unison is. unison may have its bugs (as any other piece of software) and its wrinkles. I am surprised it seems to be actively maintained now (last time I looked I think I thought it looked dead), but I'm not sure what's the state nowadays. I haven't had the need to have a two-way file synchronizer, so there may be better options, though.

Since the original question also involves a desktop and laptop and example involving music files (hence he's probably using a GUI), I'd also mention one of the best bi-directional, multi-platform, free and open source programs to date: FreeFileSync.
It's GUI based, very fast and intuitive, comes with filtering and many other options, including the ability to remote connect, to view and interactively manage "collisions" (in example, files with similar timestamps) and to switch between bidirectional transfer, mirroring and so on.
FreeFileSync can easily sync two computers on the same network and also sync two computers on different and remote networks.
On same network: have FreeFileSync use the local file system on one side and a shared network drive / path on the other. On Windows systems you enable file / disk sharing on one computer and access that share from the other. I use FreeFileSync this way to keep my main development PC source code synced with my 2 laptops.
I have also synced one of these laptops with a Linux server with Samba installed and sharing one of its directories.
Across networks: create a VPN and do the same as above. FreeFileSync will see the remote disk as it was on the local network. Or buy one router that allows you to connect a USB disk to it and share over the internet. I have installed a VPN on a remote Linux server and used it through the OpenVPN Windows client.

You could also try bitpocket: https://github.com/sickill/bitpocket

Try this,
get-music:
rsync -avzru --delete-excluded server:/media/10001/music/ /media/Incoming/music/
put-music:
rsync -avzru --delete-excluded /media/Incoming/music/ server:/media/10001/music/
sync-music: get-music put-music
I just test this and it worked for me. I'm doing a 2-way sync between Windows7 (using cygwin with the rsync package installed) and FreeNAS fileserver (FreeNAS runs on FreeBSD with rsync package pre-installed).

You might use Osync: http://www.netpower.fr/osync , which is rsync based with intelligent deletion propagation. it has also multiple options like resuming a halted execution, soft deletion, and time control.

You could try csync, it is the sync engine under the hood of owncloud.

I'm surprised no one has mentioned Syncthing yet. I have been using it for years to synchronize my phone, my tablet and my two laptops. One time I also used it to send 10 GB of photos to my family ~600 km away, straight from my machine to their machine, and it was incredibly fast (despite the data getting routed through Syncthing's discovery server to work around NAT issues). I also tried OwnCloud/NextCloud at some point but Syncthing has been much more reliable and, also, much faster.

I'm now using SparkleShare https://www.sparkleshare.org/
works on mac, linux and windows.

I'm not sure whether it works with two syncing but for the --delete to work you also need to add the --recursive parameter as well.

Rclone is what you are looking for. Rclone ("rsync for cloud storage") is a command line program to sync files and directories to and from different cloud storage providers including local filesystems. Rclone was previously known as Swiftsync and has been available since 2013.

Related

How to (semi-)automatically sync local files with remote devcontainer?

My Goal
I've been using devcontainers in combination with WSL2 for a little while now. But I keep running into issues and besides that I like off-loading resources of my laptop to a server. Moving the containers to a native Linux server would solve my issues.
My ideal situation would be to have a solution that works just like working locally on my Windows laptop (later probably moving to Macbook) but using the facilities of a Linux server (which has systemd and netns) and moving the workload there as well so my laptop doesn't sound like a vacuum cleaner.
My Journey
I'm trying to setup remote containers as described here: https://code.visualstudio.com/remote/advancedcontainers/develop-remote-host
Actually the containers are running fine, I'm using the second storage solution what means I add the following to my .devcontainer.json:
"workspaceMount": "source=/home/marvink/code,target=/workspaces,type=bind,consistency=cached"
And my workflow currently looks something like this:
Clone project locally (with .devcontainer already in the project)
Add workspaceMount above to devcontainer.json
Clone project on remote (e.g. to /home/marvink/code/new-project)
Open project locally
Build and reopen in container
Work on the files on the remote
My issue
This works but now I have files on my local drive that never get touched which isn't ideal but not a disaster, the bigger issue is when I want to update the devcontainer. I need to do that locally (in a seperate window), manually need to copy paste that to the remote if I want to commit it to git and off-course I sometimes forget this and try to edit it remotely which is causing a lot of frustration (and sometimes it seems like it does use the remote config, but that might have been a mistake?).
This is why I want to setup rsync both ways to sync changes to files and as a bonus I can edit files locally when I'm offline. In the link it's described how to do it manually but I want it automated so that I can't forget or make mistakes.
From Powershell I'm able to run an rsync command that syncs one-way and I can extend that to sync 2-way:
wsl rsync -rlptzv --progress --exclude=.git '$PWD' 'marvink#s-dev01:~/code/new-project'
This needs to be ran locally but I can't find a way to do that. I'd need to run a task locally for example, but that isn't possibly when working on a remote (https://github.com/microsoft/vscode-remote-release/issues/168).
The other way around doesn't seem like an option to me as I don't want to expose any ports on my laptop and firewalls would get in the way depending on where I am.
My question
My workflow still seems a bit convoluted so I'm open to suggestions on that end but any ideas on how to sync my workspace files?
You don't need a local version of your code (containing the .devcontainer folder) if you're storing that code on the remote server. You should be able to open an ssh target in VScode using the Remote - SSH extension, which is the recommended approach in the link you added. The extension Remote - Containers 'stacks' on top of the SSH extension, so once connected over SSH you then connect to the container using the .devcontainer.json configuration located on your remote server.
If you don't want to use the extension and use a bind mount + specify docker.host in your settings.json file, you can sync code using the approaches in that same article, through SSHFS, docker machine, or rsync.

Real remote editing without X-Forwarding, using Vim or the like

I'm currently working an a rather large web project which is written using C servlets ( utilizing GWAN Web server ). In the past I've used a couple of IDEs for my LAMP/PHP jobs, like Eclipse.
My problems with Eclipse are that you can either mirror the project locally, which isn't possible in this case as I'm working on a Mac (server does not run on OSX), or use the "remote" view, which would re-upload files when you save them.
In the later case, the file is only partly written while uploading, which makes this a no-go for a running web server, or the file could become corrupted if the connection was lost during uploading. Also, for changing some character, uploading the whole file seems rather inefficient to me.
So I was thinking:
Wouldn't it be possible to have the IDE open Vim per SSH and mirror my changes there, and then just :w (save) ? Or use some kind of diff-files for changes?
The first one would be preffered, as it has the added advantage of Vim .swp files, which makes it possible that others know when someone is already editing the file.
My current solution is using ssh+vim, but then I lose all the cool features I have with Eclipse and other more advanced IDEs.
Also, regarding X-Forwarding: The reason I don't like it is speed. It feels way slower than just editing locally, and takes up unneeded bandwidth, when all I want to do is basically "text editing".
P.S.: I couldn't find any more appropriate tags for the question, especially no "remote" tag, but if you know any, feel free to add them. Also, if there is another similar question, feel free to point it out - I couldn't find any.
Thank you very much.
If you're concerned about having to transmit the entire file for minor changes, the only solution that comes to my mind is running (either continuously, or on demand) an rsync job that mirrors the remote site to your local system (and back). The rsync protocol just transmits the delta information. According to Are rsync operations atomic at file level?, the change is atomic.
Another possibility: run everything in a virtual machine on your Mac. The server and the IDE/text editor are both on the same virtual machine so you don't have to fear network issues.
Because the source code on the virtual machine is under some kind of VCS the classic code → test → commit process is trivial (at least theoretically).

What is a good pattern to synchronize files between computers in parallel (in CentOS)?

Trying to find a good way to copy code between one "deployment" computer and several "target" computers, hopefully in parallel. The idea is that the deployment computer holds a copy of the files as they are supposed to be copied to the target servers. We would like to have copying happen in parallel, as it might involve several tens of target servers.
Our current scheme involves using rsync to synchronize the containing directory where the files reside, in order to keep the target servers up-to-date on the deployment server.
So, the questions are:
What is a good / better way to do this?
What sort of tools are used to do this?
Should this problem be faced from a different angle or perspective that I'm totally missing?
Thanks very much!
Another option is pdsh, a parallel, distributed shell. It's available from EPEL, and allows running remote commands (via ssh) on multiple nodes in parallel. For example:
pdsh -w node10,node11,node12 command
Runs "command" on all three nodes in parallel. It also has a handy hostname expression feature to do the same thing with a bit less typing:
pdsh -w node[10-12] command
It also includes the pdcp command copies files to multiple nodes in parallel. (The pdsh package needs to be installed on all nodes for pdcp to work.)
pdcp -w node[10-12] /local/file /remote/dir/
The local file is copied to the /remote/dir on all three nodes.
We use the lftp command to sync our remote web server to our local backup machine. We wrote a BaSH script to automatically sync all backups on the server to the local box, and we set that script up on a cron to run nightly.
rsync is a fine way of handling this, and I might recommend moving your current protocol into a cron setup if it isn't already.
Unison is also a tool available for setting up two way sync, if you requie that functionality.
Hope this helps!
There is a program called clusterssh that is available on debian based operating systems (but I was able to install it onto RHEL 6.3 using an RPM and resolving other dependencies) that will allow you to open an ssh terminal for multiple machines, with a single input location (this allows you type once onto as many machines as you have terminals open). Then you just have to use a simple scp. I have used this program to move a file from a development workstation to as many as 25 other workstations at the same time, but this option is only really useful if you're trying to accomplish what you stated in the question, that is, copying files from one computer to several others.
This is not an effective syncing mechanism. If you really want it to sync then the above answer would be best.

Work on a remote project with Eclipse via SSH

I have the following boxes:
a) A Windows box with Eclipse CDT,
b) A Linux box, accessible for me only via SSH.
Both the compiler and the hardware required to build and run my project is only on machine B.
I'd like to work "transparently" from a Windows box on that project using Eclipse CDT and be able to build, run and debug the project remotely from within the IDE.
How do I set up that:
The building will work? Any simpler solutions than writing a local makefile which would rsync the project and then call a remote makefile to initiate the actual build? Does Eclipse managed build have a feature for that?
The debugging will work?
Preferably - the Eclipse CDT code indexing will work? Do I have to copy all required header files from machine B to machine A and add them to include path manually?
Try the Remote System Explorer (RSE). It's a set of plug-ins to do exactly what you want.
RSE may already be included in your current Eclipse installation. To check in Eclipse Indigo go to Window > Open Perspective > Other... and choose Remote System Explorer from the Open Perspective dialog to open the RSE perspective.
To create an SSH remote project from the RSE perspective in Eclipse:
Define a new connection and choose SSH Only from the Select Remote System Type screen in the New Connection dialog.
Enter the connection information then choose Finish.
Connect to the new host. (Assumes SSH keys are already setup.)
Once connected, drill down into the host's Sftp Files, choose a folder and select Create Remote Project from the item's context menu. (Wait as the remote project is created.)
If done correctly, there should now be a new remote project accessible from the Project Explorer and other perspectives within eclipse. With the SSH connection set-up correctly passwords can be made an optional part of the normal SSH authentication process. A remote project with Eclipse via SSH is now created.
The very simplest way would be to run Eclipse CDT on the Linux Box and use either X11-Forwarding or remote desktop software such as VNC.
This, of course, is only possible when you Eclipse is present on the Linux box and your network connection to the box is sufficiently fast.
The advantage is that, due to everything being local, you won't have synchronization issues, and you don't get any awkward cross-platform issues.
If you have no eclipse on the box, you could thinking of sharing your linux working directory via SMB (or SSHFS) and access it from your windows machine, but that would require quite some setup.
Both would be better than having two copies, especially when it's cross-platform.
I'm in the same spot myself (or was), FWIW I ended up checking out to a samba share on the Linux host and editing that share locally on the Windows machine with notepad++, then I compiled on the Linux box via PuTTY. (We weren't allowed to update the ten y/o versions of the editors on the Linux host and it didn't have Java, so I gave up on X11 forwarding)
Now... I run modern Linux in a VM on my Windows host, add all the tools I want (e.g. CDT) to the VM and then I checkout and build in a chroot jail that closely resembles the RTE.
It's a clunky solution but I thought I'd throw it in to the mix.
My solution is similar to the SAMBA one except using sshfs. Mount my remote server with sshfs, open my makefile project on the remote machine. Go from there.
It seems I can run a GUI frontend to mercurial this way as well.
Building my remote code is as simple as: ssh address remote_make_command
I am looking for a decent way to debug though. Possibly via gdbserver?
I tried ssh -X but it was unbearably slow.
I also tried RSE, but it didn't even support building the project with a Makefile (I'm being told that this has changed since I posted my answer, but I haven't tried that out)
I read that NX is faster than X11 forwarding, but I couldn't get it to work.
Finally, I found out that my server supports X2Go (the link has install instructions if yours does not). Now I only had to:
download and unpack Eclipse on the server,
install X2Go on my local machine (sudo apt-get install x2goclient on Ubuntu),
configure the connection (host, auto-login with ssh key, choose to run Eclipse).
Everything is just as if I was working on a local machine, including building, debugging, and code indexing. And there are no noticeable lags.
I had the same problem 2 years ago and I solved it in the following way:
1) I build my projects with makefiles, not managed by eclipse
2) I use a SAMBA connection to edit the files inside Eclipse
3) Building the project:
Eclipse calles a "local" make with a makefile which opens a SSH connection
to the Linux Host. On the SSH command line you can give parameters which
are executed on the Linux host. I use for that parameter a makeit.sh shell script
which call the "real" make on the linux host.
The different targets for building you can give also by parameters from
the local makefile --> makeit.sh --> makefile on linux host.
The way I solved that one was:
For windows:
Export the 'workspace' directory from the Linux machine using samba.
Mount it locally in windows.
Run Eclipse, using the mounted 'workspace' directory as the eclipse workspace.
Import the project you want and work on it.
For Linux:
Mount the 'workspace' directory using sshfs
Run Eclipse.
Run Eclipse, using the mounted 'workspace' directory as the eclipse workspace.
Import the project you want and work on it.
In both cases you can either build and run through Eclipse, or build on the remote machine via ssh.
For this case you can use ptp eclipse https://eclipse.org/ptp/ for source browsing and building.
You can use this pluging to debug your application
http://marketplace.eclipse.org/content/direct-remote-c-debugging
How to edit in Eclipse locally, but use a git-based script I wrote (sync_git_repo_from_pc1_to_pc2.sh) to synchronize and build remotely
The script I wrote to do this is sync_git_repo_from_pc1_to_pc2.sh.
Readme: README_git-sync_repo_from_pc1_to_pc2.md
Update: see also this alternative/competitor: GitSync:
How to use Sublime over SSH
https://github.com/jachin/GitSync
This answer currently only applies to using two Linux computers [or maybe works on Mac too?--untested on Mac] (syncing from one to the other) because I wrote this synchronization script in bash. It is simply a wrapper around git, however, so feel free to take it and convert it into a cross-platform Python solution or something if you wish
This doesn't directly answer the OP's question, but it is so close I guarantee it will answer many other peoples' question who land on this page (mine included, actually, as I came here first before writing my own solution), so I'm posting it here anyway.
I want to:
develop code using a powerful IDE like Eclipse on a light-weight Linux computer, then
build that code via ssh on a different, more powerful Linux computer (from the command-line, NOT from inside Eclipse)
Let's call the first computer where I write the code "PC1" (Personal Computer 1), and the 2nd computer where I build the code "PC2". I need a tool to easily synchronize from PC1 to PC2. I tried rsync, but it was insanely slow for large repos and took tons of bandwidth and data.
So, how do I do it? What workflow should I use? If you have this question too, here's the workflow that I decided upon. I wrote a bash script to automate the process by using git to automatically push changes from PC1 to PC2 via a remote repository, such as github. So far it works very well and I'm very pleased with it. It is far far far faster than rsync, more trustworthy in my opinion because each PC maintains a functional git repo, and uses far less bandwidth to do the whole sync, so it's easily doable over a cell phone hot spot without using tons of your data.
Setup:
Install the script on PC1 (this solution assumes ~/bin is in your $PATH):
git clone https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles.git
cd eRCaGuy_dotfiles/useful_scripts
mkdir -p ~/bin
ln -s "${PWD}/sync_git_repo_from_pc1_to_pc2.sh" ~/bin/sync_git_repo_from_pc1_to_pc2
cd ..
cp -i .sync_git_repo ~/.sync_git_repo
Now edit the "~/.sync_git_repo" file you just copied above, and update its parameters to fit your case. Here are the parameters it contains:
# The git repo root directory on PC2 where you are syncing your files TO; this dir must *already exist*
# and you must have *already `git clone`d* a copy of your git repo into it!
# - Do NOT use variables such as `$HOME`. Be explicit instead. This is because the variable expansion will
# happen on the local machine when what we need is the variable expansion from the remote machine. Being
# explicit instead just avoids this problem.
PC2_GIT_REPO_TARGET_DIR="/home/gabriel/dev/eRCaGuy_dotfiles" # explicitly type this out; don't use variables
PC2_SSH_USERNAME="my_username" # explicitly type this out; don't use variables
PC2_SSH_HOST="my_hostname" # explicitly type this out; don't use variables
Git clone your repo you want to sync on both PC1 and PC2.
Ensure your ssh keys are all set up to be able to push and pull to the remote repo from both PC1 and PC2. Here's some helpful links:
https://help.github.com/en/github/authenticating-to-github/connecting-to-github-with-ssh
https://help.github.com/en/github/authenticating-to-github/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent
Ensure your ssh keys are all set up to ssh from PC1 to PC2.
Now cd into any directory within the git repo on PC1, and run:
sync_git_repo_from_pc1_to_pc2
That's it! About 30 seconds later everything will be magically synced from PC1 to PC2, and it will be printing output the whole time to tell you what it's doing and where it's doing it on your disk and on which computer. It's safe too, because it doesn't overwrite or delete anything that is uncommitted. It backs it up first instead! Read more below for how that works.
Here's the process this script uses (ie: what it's actually doing)
From PC1: It checks to see if any uncommitted changes are on PC1. If so, it commits them to a temporary commit on the current branch. It then force pushes them to a remote SYNC branch. Then it uncommits its temporary commit it just did on the local branch, then it puts the local git repo back to exactly how it was by staging any files that were previously staged at the time you called the script. Next, it rsyncs a copy of the script over to PC2, and does an ssh call to tell PC2 to run the script with a special option to just do PC2 stuff.
Here's what PC2 does: it cds into the repo, and checks to see if any local uncommitted changes exist. If so, it creates a new backup branch forked off of the current branch (sample name: my_branch_SYNC_BAK_20200220-0028hrs-15sec <-- notice that's YYYYMMDD-HHMMhrs--SSsec), and commits any uncommitted changes to that branch with a commit message such as DO BACKUP OF ALL UNCOMMITTED CHANGES ON PC2 (TARGET PC/BUILD MACHINE). Now, it checks out the SYNC branch, pulling it from the remote repository if it is not already on the local machine. Then, it fetches the latest changes on the remote repository, and does a hard reset to force the local SYNC repository to match the remote SYNC repository. You might call this a "hard pull". It is safe, however, because we already backed up any uncommitted changes we had locally on PC2, so nothing is lost!
That's it! You now have produced a perfect copy from PC1 to PC2 without even having to ensure clean working directories, as the script handled all of the automatic committing and stuff for you! It is fast and works very well on huge repositories. Now you have an easy mechanism to use any IDE of your choice on one machine while building or testing on another machine, easily, over a wifi hot spot from your cell phone if needed, even if the repository is dozens of gigabytes and you are time and resource-constrained.
Resources:
The whole project: https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles
See tons more links and references in the source code itself within this project.
How to do a "hard pull", as I call it: How do I force "git pull" to overwrite local files?
Related:
git repository sync between computers, when moving around?

If possible how can one embed PostgreSQL?

If it's possible, I'm interested in being able to embed a PostgreSQL database, similar to sqllite. I've read that it's not possible. I'm no database expert though, so I want to hear from you.
Essentially I want PostgreSQL without all the configuration and installation. If it's possible, tell me how.
Run postgresql in a background process.
Start a separate thread in your application that would start a postgresql server in local mode either by binding it to localhost with some random free port or by using sockets (does windows support sockets?). That should be fairly easy, something like:
system("C:\Program Files\MyApplication\pgsql\postgres.exe -D C:\Documents and Settings\User\Local Settings\MyApplication\database -h 127.0.0.1 -p 12345");
and then just connect to 127.0.0.1:12345.
When your application quits, you can always send a SIGTERM to your thread and then wait a few seconds for postgresql to quit (ie join the thread).
PS: You can also use pg_ctl to control your "embedded" database, even without threads, just do a "pg_ctl start" (with appropriate options) when starting the application and "pg_ctl stop" when quitting it.
You cannot embed it, nor should you try.
For embedding you should use sqlite as you mentioned or firebird rdbms.
Unless you do a major rewrite of code, it is not possible to run Postgres "embedded". Either run it as a separate process or use something else. SQLite is an excellent choice. But there are others. MySQL has an embedded version. See it at http://mysql.com/oem/. Also several java choices, and Mac has Core Data you can write too. Hell, you can even use FoxPro. What OS you on and what services you need from the database?
You can't embed it as a in process type thing like sqlite etc, but you can easily embed it into your application setup using Inno setup at http://www.innosetup.org. Search their mailing list archive and you will find someone did most of the work for you and all you have to to is grab the zipped distro and you can easily have postgresql installed when the user installs your app. You can then use the pg_hba.conf file to restrict the server to local host only. Not a true embedded DB, but it would work.
PostgreSQL is intended to run as a stand-alone server; it's probably possible to embed it if you hack at it hard and long enough, but it would be much easier to just run it as intended in a separate process.
HSQLDB (http://hsqldb.org/) is another db which is easily embedded. Requires Java, but is an excellent and often-used choice for Java applications.
Anyone tried on Mac OS X:
http://pagesperso-orange.fr/bruno.gaufier/xhtml/prod_postgresql.xhtml
http://www.macosxguru.net/article.php?story=20041119135924825
(Of course sqlite would be my embedded db of choice as well)
Well, I know this is a very very very old post, but if anyone has nowadays this question, I would refer to:
You can use containers running Postgres. Here's a post that could be helpful, doing something along this line using R:
https://rsangole.netlify.app/post/2021/08/07/docker-based-rstudio-postgres/?utm_source=pocket_mylist
Take a look at duckdb https://duckdb.org/docs/installation/ It is relatively new and still needs to mature. But it works pretty much like an embedded database ("In-process, serverless"), with bindings for several languages (Python, R, Java, ...)

Resources