Class ParallelDelete

java.lang.Object
com.aoapps.hodgepodge.io.ParallelDelete

public final class ParallelDelete extends Object

Our backup directories contain parallel directories with many hard links. The performance of deleting more than one of the directories can be improved by deleting from them in parallel.

Also performs the task with three threads:

     Iterate filesystem -> Delete entries -> Verbose Output
     (Calling Thread)      (New Thread)      (New Thread)
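
The three-stage layout above can be sketched with two bounded queues linking the stages. This is only an illustration under stated assumptions, not the actual ParallelDelete implementation: the class name DeletePipelineSketch, the queue capacity, the END sentinel, and the single-root signature are all hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintStream;
import java.nio.file.Files;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the iterate -> delete -> verbose pipeline.
public class DeletePipelineSketch {
  // Sentinel marking the end of the stream; compared by identity only.
  private static final File END = new File("");

  public static void delete(File root, PrintStream verboseOutput)
      throws IOException, InterruptedException {
    BlockingQueue<File> deleteQueue = new ArrayBlockingQueue<>(1000);
    BlockingQueue<File> verboseQueue = new ArrayBlockingQueue<>(1000);
    AtomicReference<IOException> failure = new AtomicReference<>();

    // Stage 2 (new thread): delete entries, then hand them to the printer.
    Thread deleter = new Thread(() -> {
      try {
        File f;
        while ((f = deleteQueue.take()) != END) {
          try {
            Files.delete(f.toPath());
            verboseQueue.put(f);
          } catch (IOException e) {
            // Remember the first failure but keep draining the queue,
            // so the iterating thread never blocks forever on put().
            failure.compareAndSet(null, e);
          }
        }
        verboseQueue.put(END);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });

    // Stage 3 (new thread): verbose output of each deleted path.
    Thread printer = new Thread(() -> {
      try {
        File f;
        while ((f = verboseQueue.take()) != END) {
          verboseOutput.println(f.getPath());
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });

    deleter.start();
    printer.start();
    try {
      // Stage 1 (calling thread): iterate the filesystem.
      walk(root, deleteQueue);
    } finally {
      deleteQueue.put(END);
    }
    deleter.join();
    printer.join();
    IOException e = failure.get();
    if (e != null) {
      throw e;
    }
  }

  // Post-order walk: children are queued before their parent directory,
  // so each directory is already empty when the deleter reaches it.
  private static void walk(File f, BlockingQueue<File> queue) throws InterruptedException {
    if (!Files.isSymbolicLink(f.toPath()) && f.isDirectory()) {
      File[] children = f.listFiles();
      if (children != null) {
        for (File child : children) {
          walk(child, queue);
        }
      }
    }
    queue.put(f);
  }
}
```

Bounded queues keep the iterating thread from running arbitrarily far ahead of the deletes, which matters on a machine with little RAM.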
 

The following benchmarks verify that this is, in fact, true. They were measured with a copy of the backups from one of our managed servers. The system RAM was limited to 128 MB to better simulate backup server hardware. The ext3 benchmarks ran on a Maxtor 250 GB 7200 RPM SATA drive; the reiserfs benchmarks ran on a WD 80 GB 7200 RPM IDE drive.

                       +---------------------+---------------------+
                       |         ext3        |      reiserfs       |
 +-----------+---------+----------+----------+----------+----------+
 | # Deleted |         | parallel |  rm -rf  | parallel |  rm -rf  |
 +-----------+---------+----------+----------+----------+----------+
 |      1/13 |         |     TODO |     TODO |     TODO |     TODO |
 |      2/13 |         |     TODO |     TODO |     TODO |     TODO |
 |      3/13 |         |     TODO |     TODO |     TODO |     TODO |
 |      4/13 |         |     TODO |     TODO |     TODO |     TODO |
 |      8/13 |         |     TODO |     TODO |     TODO |     TODO |
 +-----------+---------+----------+----------+----------+----------+
 |           | User    |    61.99 |     2.61 |    63.29 |     3.00 |
 |     13/13 | System  |    89.90 |    48.01 |   180.69 |   113.26 |
 |           | Elapsed | 10:38.53 | 10:23.79 |  8:26.71 | 33:13.52 |
 |           | % CPU   |      23% |       8% |      48% |       5% |
 +-----------+---------+----------+----------+----------+----------+
 

TODO: Once benchmarks are finished for the other # Deleted values, adjust the threshold between rm and parallel in FailoverFileReplicationManager

TODO: Should it use a provided ExecutorService instead of creating its own threads?

TODO: Concurrent deletes would be possible. Is there any advantage?

Author:
AO Industries, Inc.
  • Method Details

    • main

      public static void main(String[] args)
      Deletes multiple directories in parallel (but not concurrently).
    • parallelDelete

      public static void parallelDelete(List<File> directories, PrintStream verboseOutput, boolean dryRun) throws IOException
      Recursively deletes all of the files in the provided directories. Also deletes the directories themselves. It is assumed that the directory contents are not changing, and there are no safeguards to protect against this. This implies a race condition in which the delete could follow a symbolic link and delete outside the intended directory trees.
      Throws:
      IOException
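
For comparison, here is a minimal sequential sketch with the same parameter shape, built on java.nio.file.Files.walkFileTree, which does not follow symbolic links unless FOLLOW_LINKS is requested. It is not how parallelDelete is implemented. The class name SequentialDeleteSketch is hypothetical, and two behaviors are assumptions: dryRun reports paths without deleting them, and a null verboseOutput means no output. This approach narrows the symbolic link problem described above but may not eliminate the race if the tree changes during the walk.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintStream;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.List;

// Hypothetical sequential counterpart, for illustration only.
public class SequentialDeleteSketch {

  public static void deleteAll(List<File> directories, PrintStream verboseOutput, boolean dryRun)
      throws IOException {
    for (File directory : directories) {
      // No FOLLOW_LINKS option: symbolic links are visited as plain
      // entries and are never descended into.
      Files.walkFileTree(directory.toPath(), new SimpleFileVisitor<Path>() {
        @Override
        public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
            throws IOException {
          remove(file, verboseOutput, dryRun);
          return FileVisitResult.CONTINUE;
        }

        @Override
        public FileVisitResult postVisitDirectory(Path dir, IOException exc)
            throws IOException {
          if (exc != null) {
            throw exc;
          }
          // All children have been handled, so the directory is empty
          // (unless this is a dry run).
          remove(dir, verboseOutput, dryRun);
          return FileVisitResult.CONTINUE;
        }
      });
    }
  }

  private static void remove(Path path, PrintStream verboseOutput, boolean dryRun)
      throws IOException {
    if (verboseOutput != null) {
      verboseOutput.println(path);
    }
    if (!dryRun) {
      // Deleting a symbolic link removes the link itself, not its target.
      Files.delete(path);
    }
  }
}
```

Running once with dryRun set to true and inspecting the verbose output is a cheap way to confirm the list of paths before deleting anything.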