Script: clean_bzip – a command line program for clean directory compression
A simple command-line Python utility that compresses a directory (or directories) and excludes certain unwanted files.
As a self-taught Python user I am still looking for insight on the most pythonic and programmatically-friendly way of accomplishing a given task. In this case, I have written a script that will perform a “clean bzip2” of a directory (or directories). Mac OS X (via AFP and netatalk, in my case) tends to leave a bunch of ugly files/directories hanging around and I would rather not include them in my compressed tar file.
In writing the script, though, I ran into some questions and I am not sure what the recommended approach would be. The script works as it is, but I feel it's a little hacked together and also a little limited in its application. There is something to be said for programs that “just work” (this does) but I want to take it a little further as an educational endeavor and would like it to be robust, forward-thinking, and pythonic.
My initial questions are:
- Is there a better way to implement a --quiet flag? [using logging]
- I am not very clear on the use of Exceptions (or even if I am using it in a good way here); is what I have done the right approach?
- Finally, in general: any feedback on how to improve this? (I am thinking, just now, that the script is only suitable for a command line usage, and couldn’t be imported by another script, for example.)
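To make that last point concrete, here is a rough sketch of what an importable entry point might look like. The names `parse_args`, `log_level`, and the `compress` parameter are placeholders made up for illustration; they are not part of the script below, and this is only one possible arrangement. The idea is that `main` accepts an argument list, sets the logging level from `--quiet`, and returns a count instead of printing, so another script can call it.

```python
import logging
import os
from optparse import OptionParser


def parse_args(argv=None):
    """Parse options; argv excludes the program name (None means sys.argv[1:])."""
    parser = OptionParser(usage="%prog [-q] dir1 [dir2 ...]")
    parser.add_option("-q", "--quiet", action="store_true", dest="quiet",
                      default=False)
    return parser.parse_args(argv)


def log_level(quiet):
    """Map the --quiet flag onto a logging level."""
    return logging.ERROR if quiet else logging.INFO


def main(argv=None, compress=None):
    """Hypothetical importable entry point.

    `compress` is an assumed callable taking one directory path
    (for example, the script's tar_bzip2_directory).
    Returns the number of arguments skipped because they were not directories.
    """
    options, args = parse_args(argv)
    logging.basicConfig(level=log_level(options.quiet), format="%(message)s")
    skipped = 0
    for arg in args:
        if os.path.isdir(arg):
            if compress is not None:
                compress(arg)
        else:
            logging.error("Ignoring: %s (it's not a directory).", arg)
            skipped += 1
    return skipped
```

With something like this, the `__main__` block could shrink to a single call such as `sys.exit(main(compress=tar_bzip2_directory))`, and a caller in another module could pass its own argument list.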
Any feedback is greatly appreciated. Writing a script like this is a good learning tool (for me, at least).
#! /usr/bin/env python
'''Script to perform a "clean" bzip2 on a directory (or directories). Removes
extraneous files that are created by Apple/AFP/netatalk before compressing.
'''
import os
import tarfile
import logging
from optparse import OptionParser
# Default files and directories to exclude from the bzip tar
IGNORE_DIRS = ('.AppleDouble',)
IGNORE_FILES = ('.DS_Store',)
class DestinationTarFileExists(Exception):
    '''If the destination tar.bz2 file already exists.'''
def ignore_walk(directory, ignore_dirs=None, ignore_files=None):
    '''Ignore defined files and directories when doing the walk.'''
    # TODO: this does not currently take wild cards into account. For example,
    # if you wanted to exclude *.pyc files ... should fix that. Perhaps
    # consider moving this entirely into the below function (or making it more
    # reusable for other apps).
    for dirpath, dirnames, filenames in os.walk(directory):
        if ignore_dirs:
            dirnames[:] = [dn for dn in dirnames if dn not in ignore_dirs]
        if ignore_files:
            filenames[:] = [fn for fn in filenames if fn not in ignore_files]
        yield dirpath, dirnames, filenames
def tar_bzip2_directory(directory, ignore_dirs=IGNORE_DIRS,
                        ignore_files=IGNORE_FILES):
    '''Takes a directory and creates a tar.bz2 file (based on the directory
    name). You can exclude files and sub-directories as desired.'''
    file_name = '-'.join(directory.split(' '))
    tar_name = file_name.replace('/', '').lower() + ".tar.bz2"
    if os.path.exists(tar_name):
        msg = ("The file %s already exists. " +
               "Please move or rename it and try again.") % tar_name
        raise DestinationTarFileExists(msg)
    tar = tarfile.open(tar_name, 'w:bz2')
    for dirpath, dirnames, filenames in ignore_walk(directory, ignore_dirs,
                                                    ignore_files):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            logging.info(path)
            tar.add(path)
    tar.close()
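def main(args=None, callback=None):
    directories = []
    for arg in args or []:
        if os.path.isdir(arg):
            directories.append(arg)
        else:
            # logging.ERROR is the integer level constant; logging.error() logs.
            logging.error("Ignoring: %s (it's not a directory)." % arg)
    for directory in directories:
        try:
            tar_bzip2_directory(directory)
        except DestinationTarFileExists as e:
            # The "as" form works on Python 2.6+ and Python 3.
            print(e)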
if __name__ == "__main__":
    parser = OptionParser(usage="%prog [options: -q ] [dir1] [...dir2]")
    parser.add_option("-q", "--quiet", action="store_true", dest="quiet")
    options, args = parser.parse_args()
    if options.quiet:
        logging.basicConfig(level=logging.ERROR, format='%(message)s')
    else:
        logging.basicConfig(level=logging.INFO, format='%(message)s')
    main(args)
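
The TODO in `ignore_walk` mentions wildcards such as `*.pyc`. One way this might be handled is with the standard library's `fnmatch` module, which matches names against shell-style patterns. The sketch below is an assumption about how that could look, not a tested replacement; `matches_any` and `ignore_walk_glob` are hypothetical names. It uses `fnmatch.fnmatchcase` so that matching is case-sensitive on every platform, and plain names like `.DS_Store` still match themselves because they contain no wildcard characters.

```python
import fnmatch
import os


def matches_any(name, patterns):
    """Return True if name matches any shell-style pattern (e.g. '*.pyc')."""
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


def ignore_walk_glob(directory, ignore_dirs=(), ignore_files=()):
    """Like os.walk, but skips directories and files matching the patterns."""
    for dirpath, dirnames, filenames in os.walk(directory):
        # Slice assignment edits the lists in place, so os.walk will not
        # descend into the directories removed here.
        dirnames[:] = [dn for dn in dirnames
                       if not matches_any(dn, ignore_dirs)]
        filenames[:] = [fn for fn in filenames
                        if not matches_any(fn, ignore_files)]
        yield dirpath, dirnames, filenames
```

Calling `ignore_walk_glob(path, ('.AppleDouble',), ('.DS_Store', '*.pyc'))` would then skip compiled Python files along with the Apple leftovers. Note that `*`, `?`, and `[` become special characters in the ignore lists under this scheme.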
