Apply file structure diff/patch on remote system? - filesystems

Is there a tool that creates a diff of a file structure, perhaps based on an MD5 manifest? My goal is to send a package across the wire that contains new/updated files and a list of files to remove. It needs to copy over new/updated files and remove files that have been deleted on the source file structure.

You might try rsync. Depending on your needs, the command might be as simple as this:
rsync -az --del /path/to/master dup-site:/path/to/duplicate
Quoting from rsync's web site:
rsync is an open source utility that provides fast incremental file transfer. rsync is freely available under the GNU General Public License and is currently being maintained by Wayne Davison.
Or, if you prefer Wikipedia:
rsync is a software application for Unix systems which synchronizes files and directories from one location to another while minimizing data transfer using delta encoding when appropriate. An important feature of rsync not found in most similar programs/protocols is that the mirroring takes place with only one transmission in each direction. rsync can copy or display directory contents and copy files, optionally using compression and recursion.
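
If you do want the manifest approach from the question rather than rsync, the comparison step might look like the following. This is only a sketch: the function names build_manifest and diff_manifests are made up for illustration, and it assumes both trees can be read from the machine running it (in practice you would generate the target's manifest remotely and ship it back).

```python
import hashlib
import os


def build_manifest(root):
    """Map each file path (relative to root, '/'-separated) to its MD5 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            digest = hashlib.md5()
            with open(full, "rb") as fh:
                # Hash in chunks so large files are not loaded into memory at once.
                for chunk in iter(lambda: fh.read(65536), b""):
                    digest.update(chunk)
            manifest[rel] = digest.hexdigest()
    return manifest


def diff_manifests(source, target):
    """Return (to_copy, to_remove): what the target needs in order to match the source."""
    # New files are missing from target; updated files have a different digest.
    to_copy = sorted(p for p, h in source.items() if target.get(p) != h)
    # Files that exist only on the target were deleted on the source.
    to_remove = sorted(p for p in target if p not in source)
    return to_copy, to_remove
```

The package to send is then the files in to_copy plus the to_remove list. Empty directories are not tracked by this sketch.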

@vfilby I'm in the process of implementing something similar.
I've been using rsync for a while, but it gets funky when deploying to a remote server with permission changes that are out of my control. With rsync you can choose not to include permissions, but they still end up being considered for some reason.
I'm now using git diff. This works very well for text files. Diff generates patches, rather than a MANIFEST that you have to include with your files. The nice thing about patches is that there is already an established framework for using and testing these patches before they're applied.
For example, with the patch utility that comes standard on any *unix box, you can run the patch in dry-run mode. This will tell you whether the patch is actually going to apply before you run it. This helps you make sure that the files you're updating have not changed while you were preparing the patch.
If this is similar to what you're looking for, I can elaborate on my process.
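
For a manifest-style package (as opposed to a real patch), a coarse analogue of that dry run is to record a digest of every target file when the update is prepared and to verify those digests just before applying it. The sketch below is not what patch does internally (patch checks the context lines of each hunk); the name dry_run and the "None means the file should not exist yet" convention are assumptions for illustration.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def dry_run(root, expected):
    """Return the relative paths under root that changed since the update was prepared.

    expected maps a relative path to the digest the file had at preparation time,
    or to None if the file is expected not to exist yet.
    """
    conflicts = []
    for rel, want in sorted(expected.items()):
        full = os.path.join(root, rel)
        have = file_digest(full) if os.path.isfile(full) else None
        if have != want:
            conflicts.append(rel)
    return conflicts
```

An empty result means the target is still in the state the update was built against; anything else should stop the deployment before a single file is touched.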

Related

Real remote editing without X-Forwarding, using Vim or the like

I'm currently working on a rather large web project which is written using C servlets (utilizing the GWAN web server). In the past I've used a couple of IDEs for my LAMP/PHP jobs, like Eclipse.
My problems with Eclipse are that you can either mirror the project locally, which isn't possible in this case as I'm working on a Mac (server does not run on OSX), or use the "remote" view, which would re-upload files when you save them.
In the latter case, the file is only partly written while uploading, which makes this a no-go for a running web server, or the file could become corrupted if the connection was lost during uploading. Also, uploading the whole file just to change a single character seems rather inefficient to me.
So I was thinking:
Wouldn't it be possible to have the IDE open Vim per SSH and mirror my changes there, and then just :w (save) ? Or use some kind of diff-files for changes?
The first one would be preferred, as it has the added advantage of Vim .swp files, which let others know when someone is already editing the file.
My current solution is using ssh+vim, but then I lose all the cool features I have with Eclipse and other more advanced IDEs.
Also, regarding X-Forwarding: The reason I don't like it is speed. It feels way slower than just editing locally, and takes up unneeded bandwidth, when all I want to do is basically "text editing".
P.S.: I couldn't find any more appropriate tags for the question, especially no "remote" tag, but if you know any, feel free to add them. Also, if there is another similar question, feel free to point it out - I couldn't find any.
Thank you very much.
If you're concerned about having to transmit the entire file for minor changes, the only solution that comes to my mind is running (either continuously, or on demand) an rsync job that mirrors the remote site to your local system (and back). The rsync protocol just transmits the delta information. According to the question "Are rsync operations atomic at file level?", the change is atomic.
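
By default rsync gets that file-level atomicity by writing the new content to a temporary file and then renaming it over the original. If you end up scripting uploads yourself, the same pattern can be sketched like this (atomic_write is a made-up helper name, and the sketch assumes the temporary file and the destination live on the same filesystem, which is why it is created in the destination directory):

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to path so readers see either the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the destination so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # make sure the bytes are on disk before the rename
        os.replace(tmp_path, path)  # the rename is the atomic step
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that mkstemp creates the file with restrictive permissions, so a real deployment script would probably also copy the mode of the file being replaced.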
Another possibility: run everything in a virtual machine on your Mac. The server and the IDE/text editor are both on the same virtual machine so you don't have to fear network issues.
Because the source code on the virtual machine is under some kind of VCS the classic code → test → commit process is trivial (at least theoretically).

What is a good pattern to synchronize files between computers in parallel (in CentOS)?

Trying to find a good way to copy code between one "deployment" computer and several "target" computers, hopefully in parallel. The idea is that the deployment computer holds a copy of the files as they are supposed to be copied to the target servers. We would like to have copying happen in parallel, as it might involve several tens of target servers.
Our current scheme involves using rsync to synchronize the containing directory where the files reside, in order to keep the target servers up-to-date on the deployment server.
So, the questions are:
What is a good / better way to do this?
What sort of tools are used to do this?
Should this problem be faced from a different angle or perspective that I'm totally missing?
Thanks very much!
Another option is pdsh, a parallel, distributed shell. It's available from EPEL, and allows running remote commands (via ssh) on multiple nodes in parallel. For example:
pdsh -w node10,node11,node12 command
Runs "command" on all three nodes in parallel. It also has a handy hostname expression feature to do the same thing with a bit less typing:
pdsh -w node[10-12] command
It also includes the pdcp command, which copies files to multiple nodes in parallel. (The pdsh package needs to be installed on all nodes for pdcp to work.)
pdcp -w node[10-12] /local/file /remote/dir/
The local file is copied to the /remote/dir on all three nodes.
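
If installing pdsh on every node is not an option, the same fan-out can be sketched around the rsync you already use. This is a minimal illustration, not a finished tool: push_to_all and rsync_command are hypothetical names, the --delete flag is an assumption about how you want the targets mirrored, and passwordless ssh to each host is assumed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def rsync_command(source, host, dest):
    """Build the rsync invocation that mirrors source to host:dest."""
    return ["rsync", "-az", "--delete", source, f"{host}:{dest}"]


def push_to_all(source, hosts, dest, run=subprocess.call, max_workers=8):
    """Run one rsync per host concurrently and return {host: exit_code}.

    hosts must be a list; run is injectable so the fan-out can be tried without a network.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of hosts, so zip pairs each host with its own exit code.
        codes = pool.map(lambda host: run(rsync_command(source, host, dest)), hosts)
        return dict(zip(hosts, codes))
```

Threads are enough here because each worker just waits on an external rsync process; max_workers caps how many transfers hit the deployment machine's uplink at once.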
We use the lftp command to sync our remote web server to our local backup machine. We wrote a Bash script to automatically sync all backups on the server to the local box, and we set that script up as a cron job to run nightly.
rsync is a fine way of handling this, and I might recommend moving your current protocol into a cron setup if it isn't already.
Unison is also a tool available for setting up two-way sync, if you require that functionality.
Hope this helps!
There is a program called clusterssh, available on Debian-based operating systems (I was also able to install it on RHEL 6.3 from an RPM after resolving some dependencies), that lets you open ssh terminals to multiple machines with a single input location, so whatever you type once goes to every open terminal. Then you just have to use a simple scp. I have used this program to move a file from a development workstation to as many as 25 other workstations at the same time. This option is only really useful if you're trying to accomplish what you stated in the question, that is, copying files from one computer to several others.
This is not an effective syncing mechanism. If you really want it to sync then the above answer would be best.

openldap data files, what do they look like

From my slapd.conf file, I can see where my data is stored. When I look into that data directory I see two kinds of files. One kind is .bdb files, which appear to be the data files, as that is the extension defined in the config file. But I also have a bunch of log files, which appear to be binary when I try to read them in vi. I'm not sure if they are supposed to be there or if this is an oversight by someone previous to me. If I want to restore from an .ldif file, am I losing anything by deleting all the log files? Do I just need to delete the .bdb files?
They are Berkeley DB files.
On Ubuntu 10.04, for example, you can install the db4.7-util package and get some information using the various db4.7_* utils (e.g. db4.7_dump or db4.7_stat). This being said, the structure of the database really depends on how OpenLDAP is coded (it's an internal format, so it's not particularly useful unless you really want to dig into it).
If you want to restore from an LDIF file, use LDAP clients or OpenLDAP commands such as ldapadd.

Two way sync with rsync

I have a folder a/ and a remote folder A/.
I now run something like this in a Makefile:
get-music:
	rsync -avzru server:/media/10001/music/ /media/Incoming/music/
put-music:
	rsync -avzru /media/Incoming/music/ server:/media/10001/music/
sync-music: get-music put-music
when I make sync-music, it first gets all the diffs from server to local and then the opposite, sending all the diffs from local to server.
This works very well as long as there are only updates or new files. If there are deletions, it doesn't do anything.
rsync has the --delete and --delete-after options to help accomplish what I want, but the thing is, they don't work for a 2-way sync.
If I want server files deleted during a sync after the local copies have been deleted, that works. But if, for the reason explained below, some files exist locally but not on the server because they were deleted there, I want them removed locally rather than copied back to the server (which is what happens now).
Thing is I have 3 machines in context:
desktop
notebook
home-server
So sometimes the server will have had files deleted by a notebook sync, for example, and then when I run a sync with my desktop (where the deleted files still exist) I want these files to be deleted locally and not copied to the server again.
I guess this is only possible with a database and track of operations :P
Any simpler solutions?
Thank you.
Try Unison: http://www.cis.upenn.edu/~bcpierce/unison/
Syntax:
unison dirA/ dirB/
Unison asks what to do when files are different, but you can automate the process by using the following which accepts default (nonconflicting) options:
unison -auto dirA/ dirB/
unison -batch dirA/ dirB/ asks no questions at all, and writes to output how many files were ignored (because they conflicted).
Note: I am no longer using Unison (I use Nextcloud, which doesn't address the original use case). However, note that rsync is not designed for bidirectional sync, while Unison is. Unison may have its bugs (like any other piece of software) and its wrinkles. I am surprised it seems to be actively maintained now (the last time I looked, it seemed dead), but I'm not sure what the state is nowadays. I haven't needed a two-way file synchronizer since, so there may be better options.
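
The "database" the question suspects is needed can be as small as a list of the paths that existed after the previous sync; Unison keeps a record of the last synchronized state for the same reason. The sketch below only shows the decision logic for creations and deletions. plan_sync and the action names are invented for illustration, content conflicts are ignored, and with three machines each pair (desktop and server, notebook and server) would need its own saved list.

```python
def plan_sync(local, remote, last):
    """Decide how to reconcile two sides, given what existed after the previous sync.

    local, remote: sets of paths present on each side right now.
    last: set of paths that were present on both sides after the previous sync.
    Returns a dict mapping path -> action for every path that needs one.
    """
    actions = {}
    for path in sorted(local | remote):
        in_local = path in local
        in_remote = path in remote
        if in_local and in_remote:
            continue  # present on both sides; comparing content is out of scope here
        if in_local:
            # Known from the last sync but gone remotely: it was deleted there.
            # Otherwise it is a new local file.
            actions[path] = "delete_local" if path in last else "copy_to_remote"
        else:
            actions[path] = "delete_remote" if path in last else "copy_to_local"
    return actions
```

After carrying out the actions, the set of paths that now exists on both sides is saved as the new last for the next run. With an empty last (a first sync), nothing is ever deleted, which is exactly the behaviour of the two plain rsync -u passes.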
Since the original question also involves a desktop and laptop and example involving music files (hence he's probably using a GUI), I'd also mention one of the best bi-directional, multi-platform, free and open source programs to date: FreeFileSync.
It's GUI based, very fast and intuitive, and comes with filtering and many other options, including the ability to connect remotely, to view and interactively manage "collisions" (for example, files with similar timestamps) and to switch between bidirectional transfer, mirroring and so on.
FreeFileSync can easily sync two computers on the same network and also sync two computers on different and remote networks.
On same network: have FreeFileSync use the local file system on one side and a shared network drive / path on the other. On Windows systems you enable file / disk sharing on one computer and access that share from the other. I use FreeFileSync this way to keep my main development PC source code synced with my 2 laptops.
I have also synced one of these laptops with a Linux server with Samba installed and sharing one of its directories.
Across networks: create a VPN and do the same as above. FreeFileSync will see the remote disk as if it were on the local network. Or buy a router that allows you to connect a USB disk to it and share it over the internet. I have installed a VPN on a remote Linux server and used it through the OpenVPN Windows client.
You could also try bitpocket: https://github.com/sickill/bitpocket
Try this:
get-music:
	rsync -avzru --delete-excluded server:/media/10001/music/ /media/Incoming/music/
put-music:
	rsync -avzru --delete-excluded /media/Incoming/music/ server:/media/10001/music/
sync-music: get-music put-music
I just tested this and it worked for me. I'm doing a 2-way sync between Windows 7 (using Cygwin with the rsync package installed) and a FreeNAS file server (FreeNAS runs on FreeBSD with the rsync package pre-installed).
You might use Osync (http://www.netpower.fr/osync), which is rsync based with intelligent deletion propagation. It also has multiple options like resuming a halted execution, soft deletion, and time control.
You could try csync; it is the sync engine under the hood of ownCloud.
I'm surprised no one has mentioned Syncthing yet. I have been using it for years to synchronize my phone, my tablet and my two laptops. One time I also used it to send 10 GB of photos to my family ~600 km away, straight from my machine to their machine, and it was incredibly fast (despite the data getting routed through one of Syncthing's relay servers to work around NAT issues). I also tried ownCloud/Nextcloud at some point but Syncthing has been much more reliable and, also, much faster.
I'm now using SparkleShare: https://www.sparkleshare.org/
It works on Mac, Linux and Windows.
I'm not sure whether it works with two-way syncing, but for --delete to work you also need the --recursive parameter (which -a already implies).
Rclone is what you are looking for. Rclone ("rsync for cloud storage") is a command line program to sync files and directories to and from different cloud storage providers including local filesystems. Rclone was previously known as Swiftsync and has been available since 2013.

Configuration Management for FPGA Designs

Which configuration management tool is the best for FPGA designs, specifically Xilinx FPGAs programmed with VHDL and C for the embedded (MicroBlaze) software?
There isn't a "best", but configuration control solutions that work for software will be OK for FPGAs - the flow is very similar. I use Subversion at work and git at home, and wrote a little on 'why' at my blog.
In other answers, binary files keep getting mentioned - the only binary files I deal with are compilation products (equivalent to software object and executables), so I don't keep them in the version control repository, I keep a zipfile for each release/tag that I create with all the important (and irritatingly slow to reproduce) ones in.
I don't think it much matters what revision control tool you use -- anything that you would consider good in general will probably be OK here. I personally use Git for a sizable Verilog + software project, and I'm quite happy with it.
What will bite you in the ass -- no matter what version control you use -- is this: The Xilinx tools don't generally respect a clean division between "input" and "output" or between (human edited) "source" and (opaque) "binary." Many of the tools like to store some state information, like a last-run time or a hash value, in their "input" files meaning that you'll get lots of false changes. Coregen does this to its .xco files, and project navigator (the main GUI) does this to its .xise files. Also, both tools have a habit of inserting or removing lines for default-valued parameters, seemingly at random.
The biggest issue I've encountered is the work-flow with Coregen: In many cases, at least one of the following is true:
You have to manually edit the HDL files produced by Coregen.
The parameters that went into Coregen are stored somewhere other than the .xco file (usually in what looks like an output file).
You have to copy-and-paste the output from Coregen into your top-level design.
This means that there is no single logical source/master location for your input to the core-generating process. So even if you have the .xco file under version control, there's no expectation that the design you're running corresponds to it. If you re-generate "the same" core from its nominal inputs, you probably won't get the right outputs. And don't even think about merging.
I suggest CM tools that support version labeling and binary files. Most Software CM applications are fine with ASCII text files. They may just store a "difference" file rather than the entire file for updates.
My recommendations: PVCS, ClearCase and Subversion. DO NOT USE Microsoft SourceSafe. I don't like it because it only supports one label per revision.
I've seen Perforce and Subversion used in a couple of FPGA-intensive companies.
We use Perforce, and it's great. You can have your code that lives in Linux-land checked in side-by-side with your Specs and Docs that live in Windows-land. And you get branching, labels, etc.
I've seen everything from Clearcase to RCS used, and it is really all okay for this kind of thing. The important thing is to get a good set of check-in policies established for your group, and make sure they stick to it.
And have automated nightly regressions. That way, when someone breaks the rules, they can be identified and publicly shamed.
I have personally used Perforce, Subversion, git and ClearCase for FPGA projects. Since VHDL and C are just text files, any of them works fine. However, be sure to capture the other project and constraint files and any libraries you use.
Also think about what to do with the outputs, e.g. log file and bitstreams. Both tend to be big and the bitstreams are binaries.
Previously I used Subversion, but I switched to git two years ago. Git handles FPGA design files just as well as it handles every other text and binary file. Git is all you need for version controlling your files and artifacts.
For building the designs, I recommend just using a single ISE project called "ise" (living in a subdirectory called "ise/"). You can take a look at my (very modest) FPGA open-source project on github for the file layout. I don't bother storing the ISE files at all since they are easy to regenerate. The only things I save are the Verilog files and some ISIM waveform config files. In other projects that use coregen I save the coregen.cgp project file and all of the *.xco scripts for regenerating cores. Then I use a Makefile for actually running coregen on the *.xco files. There are a few other Xilinx-specific files you should version control too: *.ucf, *.coe, *.xcf, etc.
I experimented with using Makefiles and the Xilinx command-line tools but found that ISE did a much better job tracking dependencies and calling the tools with the right arguments. Just don't make the mistake of trying to version control your ise/ project files or you will go mad. Xilinx has something like 300 different file types which change every release. If you want to save a file, you can try the ISE project file itself with a .xise extension. Anything that is hard to recreate, like the golden bitfile that you know works and took 6 hours to build, you might want to copy that and configuration manage it explicitly.

Resources