Filtering files while preserving the directory structure #2

So, I was running the script I wrote yesterday over about 10 GB of data, and it died partway through.

/home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/thread.rb:203:in `stop': stopping only thread (ThreadError)
        note: use sleep to stop forever from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/thread.rb:203:in `wait'
        from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/thread.rb:155:in `exclusive_unlock'
        from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/thread.rb:33:in `exclusive'
        from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/thread.rb:147:in `exclusive_unlock'
        from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/thread.rb:201:in `wait'
        from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/shell/process-controller.rb:185:in `wait_all_jobs_execution'
        from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/thread.rb:135:in `synchronize'
        from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/shell/process-controller.rb:182:in `wait_all_jobs_execution'
        from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/shell/command-processor.rb:246:in `check_point'
        from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/shell/command-processor.rb:254:in `transact'
        from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/shell/command-processor.rb:519:in `__send__'
        from /home/oda/opt/stow/ruby-1.8.5_amd64/lib/ruby/1.8/shell/command-processor.rb:519:in `transact'
        from ./bin/dir_filter.rb:27:in `main'
        from ./bin/dir_filter.rb:18:in `each'
        from ./bin/dir_filter.rb:18:in `main'
        from ./bin/dir_filter.rb:49

Hmm, what is this? Probably something lock-related. Chasing it down would be a hassle, and the overhead also seems too large to ignore, so I rewrote the script in Python.

#!/usr/bin/env python

import sys
import os
import subprocess

def main(src, dest, debug, commands):
    for src_file, dest_file in list_files(src, dest):
        if debug > 0:
            # The loop variables are src_file/dest_file; the old names raised NameError.
            print >>sys.stderr, 'input_file:  %s' % src_file
            print >>sys.stderr, 'output_file: %s' % dest_file

        output_dir = os.path.dirname(dest_file)
        if not os.path.exists(output_dir):
            if debug > 0: print >>sys.stderr, 'make directory: %s' % output_dir
            os.makedirs(output_dir)

        input_fp = file(src_file)
        output_fp = file(dest_file, 'w')
        try:
            process = subprocess.Popen(commands, stdin=input_fp, stdout=output_fp)
            process.wait()
        finally:
            input_fp.close()
            output_fp.close()

def list_files(src, dest):
    for child in os.listdir(src):
        path = os.path.join(src, child)
        if os.path.isfile(path):
            yield path, os.path.join(dest, child)
        elif os.path.isdir(path):
            for entry in list_files(path, os.path.join(dest, child)):
                yield entry

if __name__ == '__main__':
    import optparse
    parser = optparse.OptionParser()
    parser.add_option('-t', '--to', dest='dest')
    parser.add_option('-f', '--from', dest='src')
    # default=0 keeps options.debug an integer even when -d is not given.
    parser.add_option('-d', '--debug', dest='debug', action='count', default=0)
    (options, args) = parser.parse_args()

    main(options.src, options.dest, options.debug, args)
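
For reference, here is a rough sketch of the same idea for Python 3, which is not what the script above targets. It is only an illustration under a few assumptions: the names mirrored_paths and filter_tree are made up for this example, os.walk replaces the hand-written recursion (so, unlike the isdir check above, symlinked directories are not followed by default), and check=True makes a failing filter command raise instead of being silently ignored.

```python
import os
import subprocess
import sys


def mirrored_paths(src, dest):
    """Yield (source_file, destination_file) pairs that mirror the layout under src."""
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        # relpath returns '.' for the top directory itself.
        target_dir = dest if rel == os.curdir else os.path.join(dest, rel)
        for name in filenames:
            yield os.path.join(dirpath, name), os.path.join(target_dir, name)


def filter_tree(src, dest, command):
    """Run command once per file, with stdin/stdout bound to the source/destination file."""
    for src_file, dest_file in mirrored_paths(src, dest):
        os.makedirs(os.path.dirname(dest_file), exist_ok=True)
        with open(src_file, 'rb') as input_fp, open(dest_file, 'wb') as output_fp:
            # check=True raises CalledProcessError if the filter exits non-zero.
            subprocess.run(command, stdin=input_fp, stdout=output_fp, check=True)


if __name__ == '__main__':
    # Hypothetical invocation: script.py SRC DEST COMMAND [ARGS...]
    if len(sys.argv) >= 4:
        filter_tree(sys.argv[1], sys.argv[2], sys.argv[3:])
```

As with the original, one process is spawned per input file, so the per-file overhead is still there; the sketch only tidies up the traversal and the error handling.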

Addendum

For now, I've posted it to ruby-dev.