Mac OS X core utility Find and Extended Attributes

I’ve been writing backup scripts recently. I really like rsync and have been using it with Mac OS X’s smbfs quite successfully (under 10.5). But the other day I found some files that were created on the receiving file system but contained no data.

I wanted to know how pervasive this was, so I asked find to list the files that are zero bytes long. Needless to say, I got a lot of false positives. It is not uncommon for the files I work with to have zero bytes in the data fork; usually these files keep their data in the resource fork. The find command only looks at the size of the data fork. Files with zero bytes in both forks are rare, and those are the files I needed to find.

I needed to change the behavior of find. Finding and compiling the source code for find was easy. My first thought was to add up all of the extended attributes each file has and use that as the total file size:

75a76
> #include <sys/xattr.h>
1428c1429,1453
< off_t size;
---
> off_t size, calc_size;
>    ssize_t list_size;
>
>    calc_size = entry->fts_statp->st_size;
>
>    /* get list of extended attribute names */
>    list_size = listxattr(entry->fts_path, NULL, 0, XATTR_NOFOLLOW);
>
>    if( list_size > 0 )
>    {
>       char *namebuf, *current_name;
>       namebuf = malloc( list_size );
>       listxattr(entry->fts_path, namebuf, list_size, XATTR_NOFOLLOW);
>       current_name = namebuf;
>
>       /* iterate over list of extended attribute names */
>
>       while( current_name < namebuf+list_size )
>       {
>          calc_size += getxattr(entry->fts_path, current_name, NULL, 0, 0, XATTR_NOFOLLOW);
>          current_name += strlen(current_name) + 1;
>       }
>
>       free( namebuf );
>    }
1430,1431c1455
< size = divsize ? (entry->fts_statp->st_size + FIND_SIZE - 1) /
<     FIND_SIZE : entry->fts_statp->st_size;
---
> size = divsize ? (calc_size + FIND_SIZE - 1) / FIND_SIZE : calc_size;

This worked well enough for my needs, although it has some problems. The biggest one is that it doesn’t produce file sizes that match the Finder’s Get Info dialog box. It seems the Finder only adds the data fork and the resource fork, whereas the code above adds every available extended attribute. Common additional extended attributes include “com.apple.quarantine” and “com.apple.FinderInfo”.
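One possible way to get closer to the Finder's number is to count only the resource fork attribute and skip everything else. The sketch below is an assumption about what would match, not something verified against the Finder. The helper name `add_if_resource_fork` is hypothetical. It also ignores negative sizes, since getxattr returns -1 on failure and the loop above would otherwise subtract one from the total.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Name under which Mac OS X exposes the resource fork as an extended
 * attribute. */
#define RESOURCE_FORK_ATTR "com.apple.ResourceFork"

/* Hypothetical helper: returns the running total, increased by attr_size
 * only when the attribute is the resource fork and the size lookup
 * succeeded. A failed lookup (negative size) and all other attributes,
 * such as com.apple.quarantine or com.apple.FinderInfo, leave the total
 * unchanged. */
off_t add_if_resource_fork(off_t total, const char *name, ssize_t attr_size)
{
    assert(name != NULL);

    if (attr_size > 0 && strcmp(name, RESOURCE_FORK_ATTR) == 0)
        total += attr_size;

    return total;
}
```

Inside the while loop of the patch, the accumulation line would then become `calc_size = add_if_resource_fork(calc_size, current_name, getxattr(entry->fts_path, current_name, NULL, 0, 0, XATTR_NOFOLLOW));`, with the rest of the loop left as it is.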