Just when you badly need that specific email message received three or four years ago, your email service goes down. It may happen to any hosting company, although it shouldn’t be a frequent event. Or perhaps you have decided to move to another hosting service, and you must get a copy of every message stored in the old system. Either way, the time comes when you want a backup of all your email messages. Here is how to create, automate and manage your email backup process on a Linux machine.
In this guide, we rely on two things: the getmail program for reading and saving email messages, and a Linux computer running Debian (any Linux or other computer that can run getmail will do). Getmail can retrieve messages from POP3 and IMAP4 servers, such as Gmail and the email systems run by hosting companies.
The overall backup process has two steps:
- Run getmail on your local computer to read new messages from given remote mailboxes.
- Getmail saves the messages into a backup file on your hard drive.
That’s all. A raw backup of all your messages is saved on the hard drive. A raw backup means that messages may not be pleasant to view, because they contain plenty of extra routing information and HTML code, but above all, the actual information is there. Other programs can be used to clean up messages after the backup (not covered in this guide).
How to set up automatic email backup on your computer
Log in to your Linux computer as root, or use the sudo command to get the required access rights.
Install the getmail tool on your computer
Getmail has been around for a long time, and you should be able to find it in your Linux distribution’s software package management system, such as apt-get or yum. For instance:
apt-get install getmail
This command installs getmail on Debian and on many other Linux systems that come with the apt-get package manager.
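To confirm that the installation worked before moving on, a quick check like the following can help. It is only a minimal sketch: `have_command` is a helper name I made up for illustration, and it relies on nothing but the standard `command -v` shell builtin.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
# "have_command" is a hypothetical helper name, not part of getmail.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command getmail; then
    echo "getmail found at: $(command -v getmail)"
else
    echo "getmail not found; install it with your package manager first"
fi
```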
Another way is to download the getmail software from the web site of its developer Pyropus / Charles Cazabon. Installation instructions are available on the web site as well. Manual installation is straightforward: download the tar package, and run the python installation program.
Create configuration directory
Once the software has been downloaded and installed, there is some manual work to do. The default directory for getmail configuration is ~/.getmail. This is the place for definitions that specify the mailboxes to be backed up.
I created the directory in /root/.getmail, because root (or someone with sudo rights) will run getmail later:
mkdir /homedirectory/.getmail
(replace homedirectory with your path)
Restrict access rights to the configuration directory:
chmod 700 /homedirectory/.getmail
Create a mailbox configuration for getmail
While it is possible to use the default name (getmailrc) for a configuration file, it is good practice to create a separately named file for each mailbox, particularly if you are going to back up more than one mailbox.
You have to create a text file similar to this sample in the .getmail directory (let’s name it /root/.getmail/mailbox1.conf):
[retriever]
type = SimpleIMAPSSLRetriever
server = the-IMAP.address-of.your-email-host
port = 993
mailboxes = ("INBOX", "INBOX.INBOX.Sent")
username = mymail@loginname.com
password = passwordformail

[destination]
type = Mboxrd
path = /var/mail/mailbox1
user = mailbox1

[options]
verbose = 1
read_all = false
delete = false
message_log = ~/.getmail/getmail-mailbox1.log
If your email hosting service allows POP3 access only, specify SimplePOP3SSLRetriever as the retriever type, use port 995 (the standard port for POP3 over SSL), and leave out the mailboxes argument, because POP3 has no folders.
The mailboxes argument tells getmail to read both the Inbox and Sent mail folders so that both incoming and outgoing messages are backed up. The name of the Sent messages folder varies by email server. You may have to study the documentation of the hosted email system, or ask the support team if you can’t find the real name of the folder. I discovered mine from the URL address string when I accessed the Sent folder in web mail. If you don’t want to back up Sent messages, you can leave out the mailboxes argument altogether; getmail will read the inbox anyway.
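Because the configuration file stores the mailbox password in plain text, it is worth making sure that only its owner can read it. The following is a minimal sketch, assuming your configuration files end in .conf and live in ~/.getmail as in this guide; `lock_down_configs` and the `GETMAIL_DIR` variable are names I made up for illustration.

```shell
#!/bin/sh
# Restrict every *.conf file in a directory to its owner (mode 600),
# since each one holds a mailbox password in plain text.
# "lock_down_configs" is a hypothetical helper name.
lock_down_configs() {
    dir="$1"
    for f in "$dir"/*.conf; do
        [ -f "$f" ] || continue   # skip when the glob matches nothing
        chmod 600 "$f"
    done
}

# GETMAIL_DIR is an assumed override; the default follows this guide.
lock_down_configs "${GETMAIL_DIR:-$HOME/.getmail}"
```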
Destination Type Mboxrd saves all messages (both Inbox and Sent) for a user into one file on local disk. This was my choice because I wanted the possibility to read backed up messages in Alpine email client, and later split them in separate files.
Destination path /var/mail/ is the default directory for email files in many Linux systems. Another common directory is /var/spool/mail/.
The arguments in Options section tell getmail to download new messages only (read_all = false), and leave messages intact (delete = false) in the hosted server. Changing the option value to true reverses the action in both fields.
There are plenty of additional options available for filtering and routing messages, using external programs to further process messages, and much more in the getmail configuration documentation.
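A typo in a section name is an easy mistake to make when typing the file by hand. As a rough sanity check, the sketch below only verifies that the [retriever] and [destination] sections are present; it does not validate any of the values. The function name `has_required_sections` is my own invention.

```shell
#!/bin/sh
# Succeed only if the file contains both a [retriever] and a
# [destination] section header at the start of a line.
# "has_required_sections" is a hypothetical helper name.
has_required_sections() {
    file="$1"
    grep -q '^\[retriever\]' "$file" && grep -q '^\[destination\]' "$file"
}

# Example:
#   has_required_sections /root/.getmail/mailbox1.conf && echo "sections OK"
```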
We are not done yet. A couple of tweaks are still needed once the configuration is in place. First, though, here is how to configure a Gmail backup, if you still use Gmail, that is.
Getmail configuration for backing up Google Gmail messages
Getmail can read messages from your Gmail account and back them up as well. First, you have to switch on the POP3 protocol in your Gmail account settings so that external mail programs can access your Gmail messages.
Then, create a getmail configuration file, for instance, gmail.conf in your .getmail directory. Google Gmail backup configuration could look like this (as suggested by Lori Kaufman):
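[retriever]
type = SimplePOP3SSLRetriever
server = pop.gmail.com
username = yourusername@gmail.com
password = yourpassword

[destination]
type = Mboxrd
path = /var/mail/gmail-backup

[options]
verbose = 2
message_log = ~/.getmail/gmail.log
POP3 has no concept of folders, so the mailboxes argument does not apply to this configuration. If you want to save specific Gmail folders as well, switch the retriever type to SimpleIMAPSSLRetriever (Gmail’s IMAP server is imap.gmail.com) and add a mailboxes argument with the real folder names to the Retriever section.
Ensure the local mail backup file exists and is writable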
After the mail configuration for getmail has been created and saved, a couple of things must be done before testing how it works.
If the mail file doesn’t exist on your computer, create an empty file for the mail user whose configuration file you just specified.
touch /var/mail/mailbox1
Allow getmail to write to the file:
chmod 666 /var/mail/mailbox1
If the mail user mailbox1 doesn’t have an account in the system, create one:
adduser mailbox1
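Mode 666 lets every local user read and modify the backup file. Since the sample configuration sets user = mailbox1 in the Destination section, getmail should deliver as that user, so a file owned by mailbox1 with owner-only permissions may be enough. The following is a sketch of that tighter setup, not something I have verified on every system; `prepare_mbox` is a made-up helper name.

```shell
#!/bin/sh
# Create an empty mail file that only its owner can read and write.
# If an owner name is given and the script runs as root, the file is
# handed over to that user. "prepare_mbox" is a hypothetical helper name.
prepare_mbox() {
    path="$1"
    owner="${2:-}"
    touch "$path" || return 1
    chmod 600 "$path" || return 1
    if [ -n "$owner" ] && [ "$(id -u)" -eq 0 ]; then
        chown "$owner" "$path"
    fi
}

# Example (as root, after the mailbox1 account exists):
#   prepare_mbox /var/mail/mailbox1 mailbox1
```

If getmail then reports a permission error in its log, falling back to the broader permissions shown above is the simple way out.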
Run getmail, and optionally automate the backup
Now, you can try the command that downloads all new messages from your freshly configured mailbox:
getmail --rcfile mailbox1.conf
You can read multiple mailboxes at one go (after you have created mailbox2.conf):
getmail --rcfile mailbox1.conf --rcfile mailbox2.conf
The log file, which the sample configuration places in the .getmail directory, has information that helps you track down any problems.
To automate the backup of new email messages for the mailbox you just created, insert a line similar to this into the crontab file:
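Besides reading the log, you can check that messages actually arrived by counting them in the backup file. In an mbox file every message starts with a line beginning with "From " (in the mboxrd variant, such lines inside message bodies are quoted with ">"), so counting those lines gives the number of messages. A minimal sketch, with `count_mbox_messages` as a made-up helper name:

```shell
#!/bin/sh
# Count the messages in an mbox file by counting the "From " separator
# lines that begin each message.
# "count_mbox_messages" is a hypothetical helper name.
count_mbox_messages() {
    grep -c '^From ' "$1"
}

# Examples, using the paths from this guide:
#   count_mbox_messages /var/mail/mailbox1
#   tail -n 20 ~/.getmail/getmail-mailbox1.log
```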
22 02 * * * getmail --rcfile mailbox1.conf
Every night at 2:22 a.m., new messages will be backed up to your local disk.
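Once you have several mailboxes, listing every file in the crontab line gets tedious. One option is a small wrapper script that picks up every .conf file in the configuration directory. This is only a sketch under a few assumptions: the function names are mine, the file names contain no spaces, and getmail is on the PATH that cron uses (cron often runs with a minimal PATH, so you may need the full path to getmail).

```shell
#!/bin/sh
# Print "--rcfile NAME " for every *.conf file in the given directory.
# "rcfile_args" and "backup_all" are hypothetical helper names.
rcfile_args() {
    dir="$1"
    for f in "$dir"/*.conf; do
        [ -f "$f" ] || continue   # skip when the glob matches nothing
        printf '%s %s ' --rcfile "$(basename "$f")"
    done
}

# Run getmail once for all configured mailboxes in the directory.
backup_all() {
    dir="${1:-$HOME/.getmail}"
    args=$(rcfile_args "$dir")
    if [ -z "$args" ]; then
        echo "no .conf files found in $dir" >&2
        return 1
    fi
    # $args is left unquoted on purpose so it splits into separate words.
    getmail --getmaildir "$dir" $args
}

# Example: back up every mailbox configured under /root/.getmail
#   backup_all /root/.getmail
```

Saved with a final line that calls backup_all, for instance as /usr/local/bin/getmail-backup.sh (a hypothetical path), the script can replace the getmail command in the crontab line above.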
Header image by Kristina Tripkovic.