May 11, 2013

Custom RecursiveIterator

PHP's RecursiveIterator and its relatives are quite handy in any tree-related situation. Probably the most used is RecursiveDirectoryIterator, which walks all the files in a directory recursively. But this family of iterators has some quirks that need explaining so that developers do not end up fleeing. First, you must not fear SPL. But there's more. Let's see…

Usage

Using recursive iterators is quite simple once you get it: any RecursiveIterator must be wrapped inside a RecursiveIteratorIterator in order to get its recursive behaviour. So you do:
$iterator = new RecursiveIteratorIterator($recursiveIterator);
foreach ($iterator as $item) {
    echo $item, PHP_EOL;
}
(No, that's not a typo: it is an iterator over recursive iterators.)
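If you want to try the wrapping without touching the file system, here is a minimal sketch. It assumes a made-up nested array as the tree and uses SPL's RecursiveArrayIterator, which treats nested arrays as children; the variable names are only illustrative.

```php
<?php
// Hypothetical nested data standing in for any tree structure.
$tree = ['a', ['b', 'c', ['d']], 'e'];

// RecursiveArrayIterator is a RecursiveIterator: nested arrays are its children.
$recursiveIterator = new RecursiveArrayIterator($tree);

// The wrapper performs the actual descent and exposes a flat sequence.
$iterator = new RecursiveIteratorIterator($recursiveIterator);

$items = [];
foreach ($iterator as $item) {
    $items[] = $item;
    // getDepth() reports how deep the current item sits in the tree.
    echo str_repeat('  ', $iterator->getDepth()), $item, PHP_EOL;
}
```

The loop visits a, b, c, d and e in that order, indenting each one by its depth.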

So, to iterate recursively over all the files in a directory, we can do:

$recursive = new RecursiveDirectoryIterator('/path', RecursiveDirectoryIterator::SKIP_DOTS);
$iterator = new RecursiveIteratorIterator($recursive);
foreach ($iterator as $file) {
    echo $file, PHP_EOL;
}

RecursiveIteratorIterator offers several modes of operation, selected through the second constructor argument:
  • LEAVES_ONLY (the default): Iterates only over nodes that have no children.
  • SELF_FIRST: When a node with children is found, process it first, then iterate over its children.
  • CHILD_FIRST: Iterate first over children, then process the node.
Let's have a look at the output of the snippet above under each mode, for a file system like this:
/path
  |- file1.php
  |- dir1
  |   |- file2.php
  |   |- dir2
  |       \- file3.php
  |- dir3
  \- dir4
      \- file4.php

LEAVES_ONLY

/dir1/file2.php
/dir1/dir2/file3.php
/dir4/file4.php
/file1.php

SELF_FIRST

/dir3
/dir1
/dir1/file2.php
/dir1/dir2
/dir1/dir2/file3.php
/dir4
/dir4/file4.php
/file1.php

CHILD_FIRST

/dir3
/dir1/file2.php
/dir1/dir2/file3.php
/dir1/dir2
/dir1
/dir4/file4.php
/dir4
/file1.php
Note that LEAVES_ONLY does not treat dir3 as a leaf: although it has no children, it is a directory and therefore still eligible for inner iteration.
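The same behaviour can be reproduced with a deterministic order by swapping the directory for a nested array. This is only a sketch under that assumption: arrays stand in for directories (dir3 is an empty one), strings stand in for files, and visitedKeys() is a hypothetical helper written for this example.

```php
<?php
// Hypothetical stand-in for the directory tree above: nested arrays play the
// role of directories (dir3 is empty), strings play the role of files.
$tree = [
    'file1.php' => 'file',
    'dir1' => [
        'file2.php' => 'file',
        'dir2' => ['file3.php' => 'file'],
    ],
    'dir3' => [],
    'dir4' => ['file4.php' => 'file'],
];

// Collect the keys visited under a given mode (the second constructor argument).
function visitedKeys(array $tree, int $mode): array
{
    $iterator = new RecursiveIteratorIterator(new RecursiveArrayIterator($tree), $mode);
    $keys = [];
    foreach ($iterator as $key => $value) {
        $keys[] = $key;
    }
    return $keys;
}

$leavesOnly = visitedKeys($tree, RecursiveIteratorIterator::LEAVES_ONLY);
$selfFirst  = visitedKeys($tree, RecursiveIteratorIterator::SELF_FIRST);
$childFirst = visitedKeys($tree, RecursiveIteratorIterator::CHILD_FIRST);

echo 'LEAVES_ONLY: ', implode(' ', $leavesOnly), PHP_EOL;
echo 'SELF_FIRST:  ', implode(' ', $selfFirst), PHP_EOL;
echo 'CHILD_FIRST: ', implode(' ', $childFirst), PHP_EOL;
```

Unlike a real directory listing, whose order depends on the file system, the array keeps its declared order, so the three modes are easier to compare. The empty dir3 is missing from LEAVES_ONLY but present in the other two, just as in the listings above.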

Filtering

Another issue with directory iteration is permissions. Sooner or later you will get something like this:
PHP Fatal error:  Uncaught exception 'UnexpectedValueException' with message 'RecursiveDirectoryIterator::__construct(/path): failed to open dir: Permission denied'
If you're a happy exception-deaf coder, there's a sparsely documented flag for you: RecursiveIteratorIterator::CATCH_GET_CHILD. This flag makes the iterator swallow any exception thrown while fetching a node's children and move on to the next node, so the offending subtree is skipped.
$iterator = new RecursiveIteratorIterator($recursive, RecursiveIteratorIterator::LEAVES_ONLY, RecursiveIteratorIterator::CATCH_GET_CHILD);
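A quick way to see what the flag does, without hunting for an unreadable directory, is to fake one. The sketch below rests on made-up pieces: LockedBranchIterator and leafKeys() are hypothetical names, and the exception is simulated by throwing from getChildren() for one specific key. The return type declaration assumes PHP 7.1 or later.

```php
<?php
// Hypothetical iterator that simulates an unreadable directory: asking for the
// children of the 'locked' node throws, much like "Permission denied" would.
class LockedBranchIterator extends RecursiveArrayIterator
{
    public function getChildren(): ?RecursiveArrayIterator
    {
        if ($this->key() === 'locked') {
            throw new UnexpectedValueException('Permission denied (simulated)');
        }
        return parent::getChildren();
    }
}

$tree = ['a' => 1, 'locked' => ['secret' => 2], 'b' => ['c' => 3]];

// Collect the leaf keys using the given flags (third constructor argument).
function leafKeys(array $tree, int $flags): array
{
    $iterator = new RecursiveIteratorIterator(
        new LockedBranchIterator($tree),
        RecursiveIteratorIterator::LEAVES_ONLY,
        $flags
    );
    $keys = [];
    foreach ($iterator as $key => $value) {
        $keys[] = $key;
    }
    return $keys;
}

// With the flag: the failing subtree is silently skipped.
$withFlag = leafKeys($tree, RecursiveIteratorIterator::CATCH_GET_CHILD);

// Without the flag: the exception reaches the caller.
$error = null;
try {
    leafKeys($tree, 0);
} catch (UnexpectedValueException $e) {
    $error = $e->getMessage();
}

echo implode(' ', $withFlag), PHP_EOL;
echo $error, PHP_EOL;
```

With the flag set, only a and c come out and nothing tells you that the locked branch was dropped; without it, the exception interrupts the loop.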
But if you're like me, you don't want to flush your exceptions down the toilet: they're telling you that you're doing something wrong, and that's good from a developer's point of view.
So you must either inherit from RecursiveDirectoryIterator and add the behaviour yourself, or create a RecursiveFilterIterator.

RecursiveFilterIterator

As I did in this entry, you can create a class for filtering entries:
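// Return types match the SPL method signatures and require PHP 7.1 or later.
class myRecursiveFilterIterator extends \RecursiveFilterIterator
{
    /** @var \Closure Receives the current item; returns true to keep it. */
    protected $filter;

    public function __construct(\RecursiveIterator $iterator, \Closure $filter)
    {
        $this->filter = $filter;
        parent::__construct($iterator);
    }

    // Called for every item, branches included: rejecting a branch skips its whole subtree.
    public function accept(): bool
    {
        return (bool) $this->filter->__invoke(parent::current());
    }

    // Wrap the children in the same filter so that it applies at every depth.
    public function getChildren(): ?\RecursiveFilterIterator
    {
        return new self($this->getInnerIterator()->getChildren(), $this->filter);
    }
}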
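To see the filter at work, here is a small usage sketch. The class is repeated in condensed form so the snippet runs on its own (the return types are an addition and assume PHP 7.1 or later); the nested array and the $onlyPhp closure are made up for illustration, with arrays standing in for directories.

```php
<?php
// Condensed copy of the class above so that this sketch is self-contained.
class myRecursiveFilterIterator extends RecursiveFilterIterator
{
    protected $filter;

    public function __construct(RecursiveIterator $iterator, Closure $filter)
    {
        $this->filter = $filter;
        parent::__construct($iterator);
    }

    public function accept(): bool
    {
        return (bool) $this->filter->__invoke(parent::current());
    }

    public function getChildren(): ?RecursiveFilterIterator
    {
        return new self($this->getInnerIterator()->getChildren(), $this->filter);
    }
}

// Hypothetical data: nested arrays stand in for directories.
$tree = ['index.php', 'notes.txt', ['lib.php', 'logo.png', ['deep.php']]];

// Branches (arrays) must be accepted too, otherwise their subtree is never visited.
$onlyPhp = function ($current) {
    return is_array($current) || substr($current, -4) === '.php';
};

$filtered = new myRecursiveFilterIterator(new RecursiveArrayIterator($tree), $onlyPhp);

// The filter is itself a RecursiveIterator, so it still needs the usual wrapper.
// The second argument drops the keys, which repeat at every level.
$phpFiles = iterator_to_array(new RecursiveIteratorIterator($filtered), false);

print_r($phpFiles);
```

The detail to watch is that accept() also runs for branches. A closure that only checks the file extension would reject every directory and the iteration would never go below the top level.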

Have fun with SPL and RecursiveIterators!
