7 Steps to Cleaning Up IMAP Email Inboxes with PHP


references: blog: http://www.wynia.org/

2015-10-04 13:28:19 gstlouis

I have a periodic cron job that hits my OPML list, retrieves the contents of all 800+ feeds, and converts any new entries into email messages which go to a separate email address.

In that IMAP inbox, they sit, until a POPFile filter or I take a look at them. The POPFile filter moves them into "interesting" and "uninteresting" folders.

I end up looking at both the INBOX (where unclassified stuff stays) and the "interesting" folder for stuff to read.

I read the email account in Thunderbird and use the color-coding keywords to classify the items. I hit the "1" key and the message gets flagged in red as something I want to share on the Saved Feed Items page. I hit the "2" key to flag in orange the stuff I've read and plan to delete. I don't delete messages immediately because deleting is much slower than tagging. This way, I can pretty much cruise through a folder and process it quickly.

I then go back every once in a while (see the problem on the horizon?) and move the red messages to a folder called "publish" and delete the orange ones. The "publish" folder is what the Saved Items page reads when it displays recent items. Enter the problem. Usually, I hit these folders in the "Unread" view, which only shows me the stuff I haven't yet looked at. That means a huge pile of red and orange messages that I've read but not moved can build up out of sight. This is exactly what's been happening.

When I initially created that page, I took a cursory look for a way to act on the messages based on their color-coding. I knew that, since the colors persist between my machines on that IMAP account, the flags were getting set on the server. However, dumping everything PHP knew about a message didn't reveal anywhere that the flags were stored in a form PHP could read. So, I just went forward with the manual process.

Early this morning, I'd finally had enough and dug around for the real solution. What I found led to a nice little solution to not only the original problem but mailbox cleanup in general.

PHP has a nice little function called imap_search(). It takes an SQL-like query string that describes which messages to find. One of the search criteria that imap_search() accepts is "KEYWORD", which matches the color-coding in question. However, since PHP doesn't seem to actually fetch the header that holds it (called X-Keywords, by the way), I was facing the prospect of being able to search for keywords without knowing which values to search for.

So, I logged into the server and took a look at one of the mbx files in question. That's when I saw the X-Keywords header and the values in it. I switched back to PHP to see if somehow I missed that header in my dump of a test message. Nope. It's not there.

That's still irritating, but no longer a roadblock. After looking in the right mbx file, I saw that the relevant X-Keywords values were: $Label1 for the red messages and $Label2 for the orange.

OK. To clean up completely, what I need is to delete all orange messages wherever I find them and move any red ones to a specific folder. The basic procedure for this and all other cleanup stuff is the same. Once I figured it out, I also added one that archives old items out of the publish folder (thus speeding up the Saved Items page).

Here's the basic procedure:

  1. Open a link to the IMAP account and appropriate mail folder.
    $mbox = imap_open("{mail.example.com:143}INBOX", "email@example.com", "password");
  2. Query the folder for the relevant messages.

    This is the query for "red" messages. Note the single quotes around the criteria, which keep PHP from treating $Label1 as a variable.

    $messages = imap_search($mbox, 'KEYWORD "$Label1"');

    This is the query for messages that are older than 7 days. This one's a bit more elaborate because you need a correctly formatted date.

    $end_time = strtotime("-7 days");
    $end_date = date("D, j M Y", $end_time);
    $messages = imap_search($mbox, "BEFORE \"$end_date\" SEEN UNDELETED");
  3. Reverse the array. (I do this just to get them in a neat order. This is totally optional.) Since imap_search() returns false when nothing matches, fall back to an empty array first.

    $messages = array_reverse($messages ?: []);
  4. Loop through the messages.

    foreach($messages as $item){
  5. Inside the loop, do your action.

    To delete the message:

    imap_delete ($mbox,$item);

    To move it to another folder (foldername):

    imap_mail_move($mbox, $item, "foldername");
    }
  6. Expunge the folder. This is because both moves and deletes are actually just flags on the messages. They aren't really deleted until you expunge.

    imap_expunge ($mbox);
  7. Close the mailbox.

    imap_close($mbox);

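The seven steps above can be pulled together into one function. What follows is only a minimal sketch, not the exact script from this post: the names cleanup_folder() and older_than_criteria() are hypothetical, and the date helper uses the day-month-year form from the IMAP specification (for example 27-Sep-2015) rather than the format shown in step 2. It assumes PHP's imap extension is installed.

```php
<?php
// Hypothetical helper: build the "older than N days" criteria from step 2.
// $now can be passed in so the result is predictable; it defaults to time().
function older_than_criteria(int $days, ?int $now = null): string
{
    $now = $now ?? time();
    $cutoff = date('j-M-Y', $now - $days * 86400);
    return 'BEFORE "' . $cutoff . '" SEEN UNDELETED';
}

// Hypothetical wrapper around steps 1 to 7.
// With $moveTo = null the matching messages are deleted,
// otherwise they are moved to the folder named in $moveTo.
// Returns the number of messages that were processed.
function cleanup_folder(
    string $mailbox,
    string $user,
    string $pass,
    string $criteria,
    ?string $moveTo = null
): int {
    // Step 1: open the mailbox.
    $mbox = imap_open($mailbox, $user, $pass);
    if ($mbox === false) {
        throw new RuntimeException('imap_open failed: ' . imap_last_error());
    }

    // Step 2: search. imap_search() returns false when nothing matches.
    $messages = imap_search($mbox, $criteria);
    if ($messages === false) {
        $messages = [];
    }

    // Steps 3 to 5: reverse (optional), loop, act on each message.
    $count = 0;
    foreach (array_reverse($messages) as $item) {
        if ($moveTo === null) {
            imap_delete($mbox, (string) $item);
        } else {
            imap_mail_move($mbox, (string) $item, $moveTo);
        }
        $count++;
    }

    // Step 6: deletes and moves only set flags until the expunge.
    imap_expunge($mbox);

    // Step 7: close the mailbox.
    imap_close($mbox);

    return $count;
}
```

With the placeholder account from step 1, moving the red messages would look something like cleanup_folder("{mail.example.com:143}INBOX", "email@example.com", "password", 'KEYWORD "$Label1"', "publish"), and deleting the orange ones would be the same call with 'KEYWORD "$Label2"' and no last argument.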


Going through those steps for each of the cleanup options let me completely organize my inbox automatically. I'm going to tie all of these operations to the script that actually does the fetching. Now, it will clean up old messages and then move on to getting new ones.
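One way to keep the different cleanup options in one place is a small rule table that the fetching script walks through before it gets new feeds. This is a rough sketch under several assumptions: cleanup_rules() and run_rules() are hypothetical names, the "archive" folder name is made up for illustration, and the callback passed to run_rules() stands in for whatever code actually performs the seven steps against the server.

```php
<?php
// Hypothetical rule table for the three cleanups described in this post.
// A null 'move_to' means "delete"; anything else is the target folder.
// "archive" is an assumed folder name, not one taken from the post.
function cleanup_rules(string $cutoffDate): array
{
    $rules = [];
    foreach (['INBOX', 'interesting'] as $folder) {
        // Orange messages: read and ready to be deleted.
        $rules[] = ['folder' => $folder, 'criteria' => 'KEYWORD "$Label2"', 'move_to' => null];
        // Red messages: move to the folder the Saved Items page reads.
        $rules[] = ['folder' => $folder, 'criteria' => 'KEYWORD "$Label1"', 'move_to' => 'publish'];
    }
    // Old, already-read items get archived out of "publish".
    $rules[] = [
        'folder'   => 'publish',
        'criteria' => 'BEFORE "' . $cutoffDate . '" SEEN UNDELETED',
        'move_to'  => 'archive',
    ];
    return $rules;
}

// Apply every rule through a caller-supplied callback, so the IMAP
// work itself stays in one function.
function run_rules(array $rules, callable $apply): void
{
    foreach ($rules as $rule) {
        $apply($rule['folder'], $rule['criteria'], $rule['move_to']);
    }
}
```

Keeping the rules as data means adding another cleanup is a one-line change, and the rules can be printed out for a dry run before anything is deleted.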

A couple of notes.

Not all IMAP servers support the server-side flags. In those cases, Thunderbird fakes it and stores them on the client side. This approach won't work with those servers, because there is nothing on the server for imap_search() to match.

Also, while $Label1 and $Label2 are what I used here, it's because that's what Thunderbird uses (and does so because you can change what red means from "Important" to whatever you want). If you use another email client, the keyword names may well be different.
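Because the keyword names depend on the client, it may help to keep them in one place instead of scattering them through the queries. Here is a small sketch, assuming the two Thunderbird values found above; the constant COLOR_KEYWORDS and the function keyword_criteria() are hypothetical names. The single-quoted strings matter: inside double quotes, PHP would try to expand $Label1 as a variable.

```php
<?php
// Color-to-keyword map for Thunderbird, as observed in the mbx file.
// Other clients may use different keyword names, so adjust as needed.
const COLOR_KEYWORDS = [
    'red'    => '$Label1',
    'orange' => '$Label2',
];

// Hypothetical helper: wrap a keyword in the criteria string
// that imap_search() expects.
function keyword_criteria(string $keyword): string
{
    return 'KEYWORD "' . $keyword . '"';
}
```

The search from step 2 then becomes imap_search($mbox, keyword_criteria(COLOR_KEYWORDS['red'])).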

There's more information on the IMAP queries on the imap_search() documentation page.

2015-10-04 13:28:28