<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>damontimm.com &#187; tar</title>
	<atom:link href="http://blog.damontimm.com/tag/tar/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.damontimm.com</link>
	<description>Where I go to remember what I did</description>
	<lastBuildDate>Fri, 13 Jan 2012 10:54:16 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Script: clean_bzip &#8211; a command line program for clean directory compression</title>
		<link>http://blog.damontimm.com/python-script-clean-bzip/</link>
		<comments>http://blog.damontimm.com/python-script-clean-bzip/#comments</comments>
		<pubDate>Tue, 30 Mar 2010 14:01:05 +0000</pubDate>
		<dc:creator>Damon</dc:creator>
				<category><![CDATA[scripts]]></category>
		<category><![CDATA[bzip2]]></category>
		<category><![CDATA[gzip]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[tar]]></category>

		<guid isPermaLink="false">http://blog.damontimm.com/?p=237</guid>
		<description><![CDATA[A simple command-line Python utility that compresses a directory (or directories) and excludes certain unwanted files. As a self-taught Python user I am still looking for insight on the most pythonic and programmatically-friendly way of accomplishing a given task. In this case, I have written a script that will perform a &#8220;clean bzip2&#8221; of a [...]]]></description>
			<content:encoded><![CDATA[<p>A simple command-line Python utility that compresses a directory (or directories) and excludes certain unwanted files.</p>
<p><span id="more-237"></span></p>
<p>As a self-taught Python user I am still looking for insight on the most pythonic and programmatically-friendly way of accomplishing a given task. In this case, I have written a script that will perform a &#8220;clean bzip2&#8221; of a directory (or directories). Mac OS X (via AFP and netatalk, in my case) tends to leave a bunch of ugly files and directories hanging around, and I would rather not include them in my compressed tar file.</p>
<p>In writing the script, though, I ran into some questions, and I am not sure what the recommended approach would be. The script works as it is, but I feel it&#8217;s a little hacked together and also a little limited in its application. There is something to be said for programs that &#8220;just work&#8221; (this does), but I want to take it a little further as an educational endeavor and would like it to be robust, forward-thinking, and pythonic.</p>
<p>My initial questions are:</p>
<ol>
<li><del datetime="2010-04-02T10:58:26+00:00">Is there a better way to implement a <code>--quiet</code> flag?</del> [using <code>logging</code>]</li>
<li>I am not very clear on the use of exceptions (or even whether I am using them well here). Is what I have done the right approach?</li>
<li>Finally, in general: any feedback on how to improve this? (I am thinking, just now, that the script is only suitable for command-line usage and couldn&#8217;t be imported by another script, for example.)</li>
</ol>
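<p>To make the second question concrete, this is roughly the pattern in question, reduced to a minimal sketch. The names <code>DestinationExists</code>, <code>create_archive</code>, and <code>try_create</code> are made up for illustration: the worker function raises a specific exception when the archive is already there, and the caller decides what to do about it.</p>
<pre class="brush: python;">import os
import tarfile


class DestinationExists(Exception):
    '''Hypothetical error: the target archive is already on disk.'''


def create_archive(tar_name, paths):
    '''Write paths into a new tar.bz2, refusing to overwrite tar_name.'''
    if os.path.exists(tar_name):
        raise DestinationExists("%s already exists" % tar_name)
    tar = tarfile.open(tar_name, 'w:bz2')
    try:
        for path in paths:
            tar.add(path)
    finally:
        # Close the archive even if tar.add() fails part-way through.
        tar.close()


def try_create(tar_name, paths):
    '''Return True on success, False if the archive was already there.'''
    try:
        create_archive(tar_name, paths)
    except DestinationExists:
        return False
    return True</pre>
<p>On Python 3.5 or later, an alternative might be to let <code>tarfile.open(tar_name, 'x:bz2')</code> do the check: the exclusive-creation mode raises <code>FileExistsError</code> if the file exists, which also closes the gap between checking and opening.</p>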
<p>Any feedback is greatly appreciated.  Writing a script like this is a good learning tool (for me, at least).</p>
<pre class="brush: python;">#! /usr/bin/env python

'''Script to perform a "clean" bzip2 on a directory (or directories).  Removes
extraneous files that are created by Apple/AFP/netatalk before compressing.
'''

import os
import tarfile
import logging
from optparse import OptionParser

# Default files and directories to exclude from the bzip tar
IGNORE_DIRS = ('.AppleDouble',)
IGNORE_FILES = ('.DS_Store',)

class DestinationTarFileExists(Exception):
    '''If the destination tar.bz2 file already exists.'''

def ignore_walk(directory, ignore_dirs=None, ignore_files=None):
    '''Ignore defined files and directories when doing the walk.'''

    # TODO: this does not currently take wild cards into account.  For example,
    # if you wanted to exclude *.pyc files ... should fix that.  Perhaps
    # consider moving this entirely into the below function (or making it more
    # reusable for other apps).
    for dirpath, dirnames, filenames in os.walk(directory):
        if ignore_dirs:
            dirnames[:] = [dn for dn in dirnames if dn not in ignore_dirs]
        if ignore_files:
            filenames[:] = [fn for fn in filenames if fn not in ignore_files]
        yield dirpath, dirnames, filenames

def tar_bzip2_directory(directory, ignore_dirs=IGNORE_DIRS,
                        ignore_files=IGNORE_FILES):
    '''Takes a directory and creates a tar.bz2 file (based on the directory
    name).  You can exclude files and sub-directories as desired.'''

    file_name = '-'.join(directory.split(' '))
    tar_name = file_name.replace('/','').lower() + ".tar.bz2"

    if os.path.exists(tar_name):
        msg = ("The file %s already exists. " +
                "Please move or rename it and try again.") % tar_name
        raise DestinationTarFileExists(msg)

    tar = tarfile.open(tar_name, 'w:bz2')

    try:
        for dirpath, dirnames, filenames in ignore_walk(directory, ignore_dirs,
                ignore_files):
            for filename in filenames:
                path = os.path.join(dirpath, filename)
                logging.info(path)
                tar.add(path)
    finally:
        # Close the archive even if adding a file fails part-way through.
        tar.close()

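def main(args=None, callback=None):
    directories = []

    # Treat a missing argument list as empty rather than failing on None.
    for arg in args or []:
        if os.path.isdir(arg):
            directories.append(arg)
        else:
            # logging.ERROR is an integer level constant; logging.error()
            # is the function that actually emits a message.
            logging.error("Ignoring: %s (it's not a directory)." % arg)

    for directory in directories:
        try:
            tar_bzip2_directory(directory)
        except DestinationTarFileExists as e:
            # 'except ... as' works on Python 2.6+ and Python 3.
            logging.error(e)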

if __name__ == "__main__":

    parser = OptionParser(usage="%prog [options: -q ] [dir1] [...dir2]")
    parser.add_option("-q", "--quiet", action="store_true", dest="quiet")
    options, args = parser.parse_args()

    if options.quiet:
        logging.basicConfig(level=logging.ERROR, format='%(message)s')
    else:
        logging.basicConfig(level=logging.INFO, format='%(message)s')

    main(args)</pre>
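<p>One way to address the wildcard TODO in <code>ignore_walk</code> might be the standard library's <code>fnmatch</code> module, which does shell-style matching on names. The sketch below is only an illustration under that assumption: <code>matches_any</code> and <code>pattern_walk</code> are hypothetical names, not part of the script above.</p>
<pre class="brush: python;">import fnmatch
import os


def matches_any(name, patterns):
    '''True if name matches at least one shell-style pattern.'''
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def pattern_walk(directory, ignore_dirs=(), ignore_files=()):
    '''Like os.walk, but prunes names matching the given patterns.'''
    for dirpath, dirnames, filenames in os.walk(directory):
        # Slice assignment edits the list in place, so os.walk skips
        # the pruned directories instead of descending into them.
        dirnames[:] = [dn for dn in dirnames
                       if not matches_any(dn, ignore_dirs)]
        filenames = [fn for fn in filenames
                     if not matches_any(fn, ignore_files)]
        yield dirpath, dirnames, filenames</pre>
<p>With this, <code>pattern_walk('.', ignore_files=('*.pyc', '.DS_Store'))</code> would skip compiled Python files as well as the Finder metadata. Plain names such as <code>.AppleDouble</code> still work, because a pattern without wildcards only matches itself. Note that <code>fnmatch.fnmatch</code> follows the platform's case rules, so matching is case-sensitive on Linux.</p>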
]]></content:encoded>
			<wfw:commentRss>http://blog.damontimm.com/python-script-clean-bzip/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

