#acl PaulHowarth:read,write,admin,revert,delete All:read

=== Friday 30th March 2007 ===

==== Mail Tidying ====

Just deleted around a gigabyte of mail (about 25% of the mail on my IMAP server). These were messages more than 6 months old from a few high-traffic mailing lists I'm on. I was comfortable deleting them because they're available on a variety of public archive sites.

It was easy to write a script to do this since I have the messages stored in `maildir` format on the server, where each message is a separate file. The script is careful to delete only messages that match a per-list recognition regex, so that personal emails stored in the same folder for whatever reason are not removed.

The script itself (`~/bin/list-cleanse`):

{{{
#!/bin/bash

# list-cleanse; remove old mailing list messages from folders

# Source configuration
declare -a LIST_NAME LIST_REGEX LIST_FOLDER LIST_RETENTION
source ~/lib/list-cleanse.conf || exit 1

# Generate tempfile names (1: candidate files, 2: candidates matching the regex)
TMPFILE1=/tmp/list-cleanse.$(id -u -n).$$.1
TMPFILE2=/tmp/list-cleanse.$(id -u -n).$$.2
trap 'rm -f "$TMPFILE1" "$TMPFILE2"; exit 1' 1 2 15

# Iterate through lists
LISTNUM=0
while [ -n "${LIST_NAME[$LISTNUM]}" ]; do
        echo "list-cleanse: processing ${LIST_NAME[$LISTNUM]}"
        if [ -z "${LIST_REGEX[$LISTNUM]}" -o -z "${LIST_FOLDER[$LISTNUM]}" -o -z "${LIST_RETENTION[$LISTNUM]}" ]; then
                echo "list-cleanse: list info incomplete for ${LIST_NAME[$LISTNUM]}" 1>&2
                exit 1
        fi
        # Count every message in the folder
        find "${MAILDIR}/${LIST_FOLDER[$LISTNUM]}/cur" -type f > "$TMPFILE1"
        echo "list-cleanse: files in folder: $(wc -l < "$TMPFILE1")"
        rm -f "$TMPFILE1"
        # Candidates are messages older than the retention period (in days)
        find "${MAILDIR}/${LIST_FOLDER[$LISTNUM]}/cur" -type f -mtime +"${LIST_RETENTION[$LISTNUM]}" > "$TMPFILE1"
        echo "list-cleanse: candidates for deletion: $(wc -l < "$TMPFILE1")"
        # Keep only the candidates that match the list-recognition regex
        xargs --max-args=1 grep --files-with-matches "${LIST_REGEX[$LISTNUM]}" < "$TMPFILE1" > "$TMPFILE2"
        echo "list-cleanse: matched candidates: $(wc -l < "$TMPFILE2")"
        xargs rm -f < "$TMPFILE2"
        echo "list-cleanse: folder cleansed"
        rm -f "$TMPFILE1" "$TMPFILE2"
        let LISTNUM+=1
done

# Clean up
rm -f "$TMPFILE1" "$TMPFILE2"
}}}

The configuration file (`~/lib/list-cleanse.conf`):

{{{
# Folders are relative to here
MAILDIR=$HOME/mail/inbox

LIST_NAME[0]=fedora-list
LIST_REGEX[0]="^List-Id:.*fedora-list\.redhat\.com"
LIST_FOLDER[0]=.Linux.fedora-list
LIST_RETENTION[0]=180

LIST_NAME[1]=fedora-devel-list
LIST_REGEX[1]="^List-Post:.*fedora-devel-list@redhat\.com"
LIST_FOLDER[1]=.Linux.fedora-devel-list
LIST_RETENTION[1]=180

LIST_NAME[2]=fedora-package-review
LIST_REGEX[2]="^List-Id:.*fedora-package-review\.redhat\.com"
LIST_FOLDER[2]=.Linux.fedora-package-review
LIST_RETENTION[2]=180

LIST_NAME[3]=fedora-extras-commits
LIST_REGEX[3]="^List-Id:.*fedora-extras-commits\.redhat\.com"
LIST_FOLDER[3]=.Linux.fedora-extras-commits
LIST_RETENTION[3]=180

LIST_NAME[4]=fedora-extras-list
LIST_REGEX[4]="^List-Id:.*fedora-extras-list\.redhat\.com"
LIST_FOLDER[4]=.Linux.fedora-extras-list
LIST_RETENTION[4]=180
}}}

Result:

{{{
[paul@goalkeeper lib]$ du -ks ~/mail/inbox
3687784 /home/paul/mail/inbox
[paul@goalkeeper lib]$ list-cleanse
list-cleanse: processing fedora-list
list-cleanse: files in folder: 10453
list-cleanse: candidates for deletion: 19
list-cleanse: matched candidates: 10
list-cleanse: folder cleansed
list-cleanse: processing fedora-devel-list
list-cleanse: files in folder: 21690
list-cleanse: candidates for deletion: 13893
list-cleanse: matched candidates: 13883
list-cleanse: folder cleansed
list-cleanse: processing fedora-package-review
list-cleanse: files in folder: 32573
list-cleanse: candidates for deletion: 14518
list-cleanse: matched candidates: 14517
list-cleanse: folder cleansed
list-cleanse: processing fedora-extras-commits
list-cleanse: files in folder: 59525
list-cleanse: candidates for deletion: 41129
list-cleanse: matched candidates: 41124
list-cleanse: folder cleansed
list-cleanse: processing fedora-extras-list
list-cleanse: files in folder: 29354
list-cleanse: candidates for deletion: 25577
list-cleanse: matched candidates: 25493
list-cleanse: folder cleansed
$ du -ks ~/mail/inbox
2544208 /home/paul/mail/inbox
}}}

I can already see the benefit in faster mail reading times, and of course there will be a further benefit next time I do a full backup of the server.

----
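Since the script deletes files outright, a dry run is a useful check before running it against a new folder. The following is a minimal sketch of one, not part of `list-cleanse` itself: `preview_cleanse` is a hypothetical helper name, and it assumes the same maildir layout (messages under `cur`), the same kind of regex and the same retention period in days as the configuration file. It lists the files that would be selected and removes nothing.

```shell
#!/bin/sh
# preview_cleanse FOLDER REGEX DAYS
# Hypothetical helper (an assumption, not part of list-cleanse): print the
# messages in FOLDER/cur that are older than DAYS days and match REGEX.
# Nothing is deleted.
preview_cleanse() {
    folder=$1
    regex=$2
    days=$3
    # "-exec ... {} +" passes the file names straight to grep, so no
    # temporary files or xargs are needed. grep -l prints only the names
    # of files containing a match.
    # grep exits non-zero when a batch has no matching file and find
    # passes that status on; "|| true" treats that as a normal outcome.
    find "$folder/cur" -type f -mtime +"$days" \
        -exec grep -l -e "$regex" {} + || true
}
```

For example, `preview_cleanse ~/mail/inbox/.Linux.fedora-list '^List-Id:.*fedora-list\.redhat\.com' 180` would print the same set of files that the script counts as "matched candidates" for that folder, one per line.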