I use Time Machine to back up my Macs, but that only covers the systems that I currently run. I have archives of older systems, some for nostalgic reasons, some for reference. I also have a decent set of digital artifacts (pictures, videos and documents) that I'd rather not lose. When I encounter data that I want to keep, I usually rsync it onto one or another external drive or server. However, since the data is not organized, I can't tell how much of it can simply be deleted instead of backed up again. The actual amount of data that should be backed up is probably less than half of the amount of data that exists on the various internal and external drives both at home and at work. This also means that most of my hard drives are at 90% capacity and I don't know what I can safely delete. I really needed a way of organizing the data and getting it somewhere that I can trust.

I initially heard of git-annex a while ago, when I was perusing the git wiki. It seemed like an interesting extension, but I didn't take another look at it until the creator started a Kickstarter project to extend it into a Dropbox replacement. It's an extension to git that allows managing files with git without actually checking them in. git-annex does this by replacing each file with a symlink that points to the real content in the .git/annex directory (named after a checksum of the file's contents). To illustrate, here's how to get from nothing to tracking a file with git-annex:

... && git clone repo other && cd other
big.tar.gz: broken symbolic link to ...
get ... (merging origin/git-annex into git-annex...) ok

By using git-annex, every clone doesn't have to have the data for every file. git-annex keeps track of which repositories contain each file (in a separate git branch that it maintains) and provides commands to move file data around. Every time file content is moved, git-annex updates the location information. This information can be queried to figure out where a file's content is and to limit the data manipulation commands. There is (much) more info in the walkthrough on the git-annex site.

What I have is a set of linked git repositories. My main repository is on a machine at home (which started life as a mini thumper and is now an Ubuntu box), and there are clones of that repository on various remote machines. To add a new one, all I need to do is clone an existing repository and run git annex init in that repository to register it in the system.

This has allowed me to start organizing my backup files in a simple directory structure. Here is a sampling of the directories in my repository:

VMs - VM images that I don't want to (or can't) recreate.
funny - Humorous files that I want to keep a copy of (as opposed to trusting the Internet).
media - Personal media archives, currently mostly tarballs of pictures going back ten years.
projects - Archives of inactive projects.
software - Downloaded software for which I've purchased licenses.
systems - Archives of files from systems I no longer access.

There are other directories, and these directories may change over time as I add more data. I can move the symlinks around, even without having the actual data on my system, and when I commit, git-annex will update its tracking information accordingly. Every time I add data or move things around, all I need to do is run git annex sync to synchronize the tracking data. That is the core of the simple workflow that I go through when changing data in any git-annex managed repository.

With this in place, it's easy to know where to put new data, since everything is just directories in a git repo. I can access files from anywhere because my home backup server is available as an ssh remote. More importantly, I can grab just what I want from there, because git-annex knows how to fetch the contents of a single file.

One caveat to this system is that using git and git-annex means that certain file attributes, like permissions and create/modify/access times, are not preserved. To work around this, for files that I want to preserve completely, I just tarball them up and add that file to git-annex. Another caveat is installation: installing the latest version on OS X is not the most repeatable process, and the version that comes with most Linux distributions is woefully out of date.
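The tarball workaround for attributes that git does not keep can be sketched roughly as follows. This is only an illustration, not the exact commands used for this repository: the directory and file names (old-system, notes.txt) are made up, the git annex add step is left as a comment so the sketch runs without git-annex installed, and it only looks at permissions and modification time, not creation or access time.

```shell
#!/bin/sh
set -eu

# Scratch area; every name below is hypothetical.
work=$(mktemp -d)
mkdir -p "$work/src/old-system" "$work/restore"

# A sample file with a restrictive mode and an old modification time.
printf 'secret\n' > "$work/src/old-system/notes.txt"
chmod 600 "$work/src/old-system/notes.txt"
touch -t 200501020304 "$work/src/old-system/notes.txt"

# Pack the directory; tar records each file's mode and mtime in the archive.
tar -czf "$work/old-system.tar.gz" -C "$work/src" old-system

# The tarball is the single file that would then be handed to git-annex:
#   git annex add old-system.tar.gz

# Later, unpack somewhere else; -p asks tar to restore the recorded permissions.
tar -xzpf "$work/old-system.tar.gz" -C "$work/restore"

# Show the restored file with its original mode and timestamp.
ls -l "$work/restore/old-system/notes.txt"
```

Because git-annex only ever sees the archive as one opaque file, whatever tar recorded inside it travels between repositories untouched. The cost is that a single file can no longer be fetched from inside the archive without fetching the whole tarball.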