Description
Install
Uninstall
Usage
Features
Names of extracted e-mail files
Settings in the INI file
Note concerning FAT file systems
Background information about the mbox format
References
Credits
License
mbx2eml is a 32-bit program for Windows 95/98/ME/NT/2000/XP/Vista/7 that splits mailbox files in mbox format, as well as Foxmail's mailbox files, into separate e-mail files (including all attachments). See also Examples of e-mail conversion.
No installation required, just unpack the files in the
ZIP archive to a directory on your hard disk or e.g. on your USB flash
drive. Change the settings in the INI file if
you want.
mbx2eml doesn't add entries to the Windows
registry, nor does it alter anything else on the system.
To uninstall, just delete the files.
Mbox files typically have the extension MBX (or
sometimes e.g. MBS or no extension at all). These are just conventions.
mbx2eml can process all mbox files; it's not
necessary to rename them beforehand. The mailbox files remain unchanged.
Command: mbx2eml <file specification> <output directory> [options]
( Example: mbx2eml c:\data\*.mbx c:\temp\ /p /n )
In the file specification, use *.* for all files. With * or *. you'll
get all files without extension (e.g. Mozilla Mail's and Thunderbird's
mbox files).
mbx2eml splits all mbox files that match the
given specification into separate e-mail files. It normally
creates a corresponding subdirectory in the output directory for each
mailbox file. If the output directory already exists, it should be
empty; otherwise existing e-mail files with the same name will be
overwritten!
Options:
/p
In case of an error, the program does not ask the user but writes the
message to the file
%TEMP%\mbx2eml.log
and tries to proceed.
/i
The program starts as an icon in the taskbar.
/n
Serial numbers with 8 digits are used as names of the generated e-mail files
(beginning with 00000001).
/d
All mails are unpacked to the output directory (without subdirectories).
/MinDate=...
Only mails with the given date or a more recent one will be extracted.
/MaxDate=...
Only mails with the given date or an older one will be extracted.
/Message-ID=...
Only mails with the given Message-ID will be extracted.
An option must not contain any spaces. You can combine the options
as you like.
The date must be given in ISO format, i.e. {Year}-{Month}-{Day}. So it
looks like this:
/MinDate=2006-04-27.
Only dates in the range from 1980-01-01 to 2100-12-31 are allowed.
The message-ID must be written without angle brackets, and it's not case
sensitive.
As with the other filters, all messages that match
the given criterion are extracted. However, a message-ID is normally
unique. It is therefore recommended to put an exclamation mark
directly after a message-ID that you want to search for. With this
addition, the program ends as soon as a matching message has been found
and extracted successfully. This can save a lot of time.
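The filter rules above can be illustrated with a small Python sketch that builds the option strings. The helper names date_option and message_id_option are invented for this example; only the option syntax itself is taken from the description above.

```python
from datetime import date

# Allowed date range as documented above.
MIN_ALLOWED = date(1980, 1, 1)
MAX_ALLOWED = date(2100, 12, 31)

def date_option(name, day):
    """Return e.g. '/MinDate=2006-04-27'; name is 'MinDate' or 'MaxDate'."""
    if name not in ("MinDate", "MaxDate"):
        raise ValueError("unknown date option: " + name)
    if not (MIN_ALLOWED <= day <= MAX_ALLOWED):
        raise ValueError("date outside 1980-01-01 .. 2100-12-31")
    # date.isoformat() yields {Year}-{Month}-{Day}
    return "/%s=%s" % (name, day.isoformat())

def message_id_option(message_id, stop_after_first=True):
    """Return '/Message-ID=...' without angle brackets.

    With stop_after_first, an exclamation mark is appended so that
    the program ends after the first matching message.
    """
    bare = message_id.strip().lstrip("<").rstrip(">")
    if not bare or " " in bare:
        raise ValueError("message-ID must be non-empty and contain no space")
    return "/Message-ID=" + bare + ("!" if stop_after_first else "")
```

For example, date_option("MinDate", date(2006, 4, 27)) gives the string shown above, and message_id_option("<abc@example.org>") gives /Message-ID=abc@example.org!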
When the program ends, it returns one of the following status codes to
the operating system. Batch programs, for example, can read this value
using the ERRORLEVEL feature (this works on Windows XP, but for some
reason not on Windows 98):
0 Program completed successfully.
1 No matching file found.
2 One or more errors occurred, or the user has prematurely terminated
the program.
3 A severe error occurred, so that the program had to be aborted.
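Besides batch files, any scripting language can evaluate the status code. The following Python sketch is one possible wrapper; the function names describe_status and run_mbx2eml are hypothetical, and it assumes that mbx2eml can be found on the PATH (otherwise pass the full path as exe).

```python
import subprocess

# Status codes as listed above.
STATUS_TEXT = {
    0: "program completed successfully",
    1: "no matching file found",
    2: "one or more errors occurred, or the user terminated the program",
    3: "a severe error occurred and the program was aborted",
}

def describe_status(code):
    """Translate a status code into a readable message."""
    return STATUS_TEXT.get(code, "unknown status code %d" % code)

def run_mbx2eml(file_spec, output_dir, options=(), exe="mbx2eml"):
    """Run mbx2eml and return (status code, description).

    The wildcard in file_spec is passed on unchanged, because no
    shell is involved; mbx2eml evaluates it itself.
    """
    completed = subprocess.run([exe, file_spec, output_dir, *options])
    return completed.returncode, describe_status(completed.returncode)
```

A call corresponding to the example command above would be run_mbx2eml(r"c:\data\*.mbx", "c:\\temp\\", ["/p", "/n"]).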
Each e-mail file gets a time stamp
according to its Date header field. Since this time stamp is expressed
in Universal Time Coordinated (UTC) for all mails, messages from all
parts of the world (e.g. on a mailing list) can be sorted by time in a
consistent way. If the Date header field of a mail contains invalid
data, then “01.01.1980 00:00:00” is used as time stamp.
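How such a time stamp could be derived is sketched below in Python, using the standard library's parser for Date header fields. This is only an illustration and not the program's own code; the function name utc_timestamp is made up, and treating a date without a usable time zone as UTC is an assumption of this sketch.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Fallback for invalid Date header fields, as described above.
FALLBACK = datetime(1980, 1, 1, tzinfo=timezone.utc)

def utc_timestamp(date_header):
    """Convert the body of a Date header field to a UTC datetime."""
    try:
        parsed = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return FALLBACK
    if parsed is None:
        return FALLBACK
    if parsed.tzinfo is None:
        # No usable time zone given: assume UTC (assumption of this sketch).
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

A mail dated "Thu, 27 Apr 2006 14:30:00 +0200" is thereby sorted as 2006-04-27 12:30:00 UTC, regardless of where it was sent.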
mbx2eml can read mbox files that contain
DOS/Windows line breaks (pairs of ASCII characters 13 and 10), UNIX
– including Linux and FreeBSD – line
breaks (ASCII 10), or Macintosh line breaks (ASCII 13), even when they
are all intermixed in the same file. The generated mail files contain
only DOS/Windows line breaks, and each file ends with a line break.
If the last line of a message contains only a dot, then that line will
not be copied to the mail file. This is because some programs cannot
correctly process mail files containing such a line.
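The handling of line breaks and of a trailing dot line can be sketched in Python as follows. This is a minimal illustration of the behaviour described above, not the program's actual implementation; the function name normalize_message is invented.

```python
import re

def normalize_message(raw):
    """Return the message bytes with DOS/Windows line breaks only."""
    # Try CRLF first, so that a pair is not counted as two line breaks;
    # then accept a lone CR (Macintosh) or a lone LF (UNIX).
    lines = re.split(rb"\r\n|\r|\n", raw)
    # A line break at the very end yields one empty last element.
    if lines and lines[-1] == b"":
        lines.pop()
    # Do not copy a last line that contains only a dot.
    if lines and lines[-1] == b".":
        lines.pop()
    # Every line, including the last one, ends with CRLF.
    return b"".join(line + b"\r\n" for line in lines)
```

For instance, the intermixed input b"a\rb\nc\r\nd" becomes b"a\r\nb\r\nc\r\nd\r\n".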
mbx2eml locks the mbox files while it reads
them, so that they can't be altered by other programs at the same time.
The program can read even corrupted files that contain binary data. When
extracting messages, every ASCII character 26 (“End of File”
marker) is replaced with the string “<EOF>”. So even
corrupted messages can be opened with any text editor after extraction.
When mbx2eml copies a very large number of mail
files to a disk that has a FAT file system, it
automatically creates additional directories if required. Their names
have a trailing underscore and a serial number. This happens both
with and without the command-line option
/d.
This software has been used to split e.g. mbox files with a size of about
200 MiB, containing more than 40 000 messages. Mbox
files bigger than 2 GiB cannot be processed.
An mbox does not contain file names, so the program must
create them itself.
Normally each mail file is named after its subject. Special
characters that are not allowed in FAT32 and NTFS file names under
Windows [1] are replaced:
" is replaced with '
: is replaced with .
/\?*<>| and ASCII characters < 32 are
each replaced with a blank
Superfluous blanks as well as certain expressions at the beginning of a
name are removed. Long file names are truncated, so that they don't
exceed a particular maximum. A truncated name is denoted by
“...”. If a mail doesn't have a Subject header field, or the
field body is empty, “[no subject]” is used as file name.
The names of the generated mail files can be changed by means of the
optional INI file.
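The naming rules above can be approximated by the following Python sketch. It covers only the part of the name that is derived from the subject (no appended number, no extension). The function name subject_to_name, the max_length parameter, the case-insensitive and repeated removal of prefixes, and the exact truncation point are assumptions of this example; the prefix list is the documented default of the INI setting PrefixesToRemove.

```python
import re

DEFAULT_PREFIXES = ["Re:", "Re[2]:", "Re[3]:", "Re[4]:", "Fw:", "Fwd:", "Aw:"]

def subject_to_name(subject, max_length=40, prefixes=DEFAULT_PREFIXES):
    """Derive a file name stem from the body of a Subject header field."""
    name = (subject or "").strip()
    # Remove leading expressions such as "Re:" (repeatedly, ignoring case).
    removed = True
    while removed:
        removed = False
        for prefix in prefixes:
            if name.lower().startswith(prefix.lower()):
                name = name[len(prefix):].lstrip()
                removed = True
    # Documented replacements: " becomes ', : becomes .
    name = name.replace('"', "'").replace(":", ".")
    # /\?*<>| and ASCII characters < 32 each become a blank.
    name = re.sub(r"[/\\?*<>|\x00-\x1f]", " ", name)
    # Remove superfluous blanks.
    name = re.sub(r" {2,}", " ", name).strip()
    if not name:
        return "[no subject]"
    # Denote a truncated name by "...".
    if len(name) > max_length:
        name = name[:max_length - 3].rstrip() + "..."
    return name
```

With these assumptions, the subject 'Re: Fwd: What is "this"?' yields the name What is 'this'.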
To get a unique name, '_' and a 9-digit hexadecimal number
(representing the date, time and time zone of the mail)
are appended to the file
name. This way, we almost always get a file name that depends only
on characteristics of the message itself.
When there are still duplicate names, serial numbers in square brackets
will be added to all names except the first one, e.g.
important_message_2F5B5B73A.eml
important_message_2F5B5B73A[2].eml
important_message_2F5B5B73A[3].eml
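The numbering of duplicates can be sketched like this in Python. The function name unique_names is hypothetical, and comparing names without regard to case is an assumption made here because FAT and NTFS file names are normally not case sensitive under Windows.

```python
def unique_names(names):
    """Add [2], [3], ... to all repeated names except the first one."""
    seen = {}
    result = []
    for name in names:
        key = name.lower()
        count = seen.get(key, 0) + 1
        seen[key] = count
        if count == 1:
            result.append(name)
            continue
        # Put the serial number in front of the extension, if there is one.
        stem, dot, ext = name.rpartition(".")
        if not dot:
            stem, ext = name, ""
        result.append("%s[%d]%s%s" % (stem, count, dot, ext))
    return result
```

Given three mails that all lead to important_message_2F5B5B73A.eml, this produces exactly the three names listed above.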
With the command-line option /n, serial numbers
with 8 digits are used as names of the generated e-mail files. If you
don't choose a file extension with more than 3 characters, then you'll
get short 8.3 names, which are even valid on DOS.
By means of an optional INI file, the user can change
the names of the generated mail files. The file must be in the same
directory as the program mbx2eml, and its name
must be “mbx2eml.ini”.
Example of a file “mbx2eml.ini”:
-------------------------------------------
[MailFiles]
FileExtension = msg
PrefixesToRemove = Re:, Re^2:, Re^3:, Re^4:
ReplaceInFilename = "%_", " _", ",."
MaxFilenameLength = 50
-------------------------------------------
FileExtension (default: eml)
You can write an arbitrary file extension here; it is used for all
generated mail files. With the setting
FileExtension =
the mail files will not get an extension at all.
The following settings do not apply when the command-line option
/n is used:
PrefixesToRemove (default: Re:,Re[2]:,Re[3]:,Re[4]:,Fw:,Fwd:,Aw:)
Comma separated list of expressions, which will all be removed from
the beginning of file names. With the setting
PrefixesToRemove =
no expressions will be removed from the beginning of file names.
ReplaceInFilename (default: empty)
You can write an arbitrary number of character pairs here. They must
be surrounded by double quotes, and separated by commas. Each first
character of a pair will be replaced with the second character.
This option is especially useful for IFS on which particular characters
are forbidden that are allowed on FAT32 and NTFS. That is the reason
why the character '%' is replaced with '_' in the example.
MaxFilenameLength (default: 60)
All characters are counted, including dot and extension (if present).
Valid values are whole numbers between 20 and 120 (inclusive). If the
program reads an invalid number here, it uses the default.
To disable an option in the INI file, just turn it into a
comment by putting a semicolon at the beginning of the line. If you
only want to use the default settings of the program, you can also
delete the INI file.
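To make the rules for the settings concrete, here is a Python sketch that reads such an INI text with the standard configparser module and applies the documented defaults. It only illustrates the rules described above; the function name load_settings and the returned dictionary are inventions of this example, and the program itself may parse the file differently.

```python
import configparser
import re

def load_settings(ini_text):
    """Parse the text of an mbx2eml.ini and return a settings dictionary."""
    parser = configparser.ConfigParser(
        delimiters=("=",),       # values such as "Re:" contain colons
        comment_prefixes=(";",), # a leading semicolon disables a line
        interpolation=None)      # '%' is an ordinary character here
    parser.read_string(ini_text)

    def raw(key):
        return parser.get("MailFiles", key, fallback=None)

    settings = {
        "FileExtension": "eml",
        "PrefixesToRemove": ["Re:", "Re[2]:", "Re[3]:", "Re[4]:",
                             "Fw:", "Fwd:", "Aw:"],
        "ReplaceInFilename": [],
        "MaxFilenameLength": 60,
    }
    if raw("FileExtension") is not None:
        settings["FileExtension"] = raw("FileExtension")
    if raw("PrefixesToRemove") is not None:
        parts = raw("PrefixesToRemove").split(",")
        settings["PrefixesToRemove"] = [p.strip() for p in parts if p.strip()]
    if raw("ReplaceInFilename") is not None:
        # Each pair is two characters inside double quotes.
        settings["ReplaceInFilename"] = re.findall(
            r'"(.)(.)"', raw("ReplaceInFilename"))
    try:
        length = int(raw("MaxFilenameLength") or "")
    except ValueError:
        length = None
    # Invalid numbers fall back to the default.
    if length is not None and 20 <= length <= 120:
        settings["MaxFilenameLength"] = length
    return settings
```

Applied to the example file above, this returns the extension msg, four prefixes, the three character pairs ('%', '_'), (' ', '_'), (',', '.') and a maximum length of 50.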
In contrast to the NTFS file system, on FAT file systems
(see “My Computer” > Right click at respective drive >
“Properties”) the number of entries in a directory is limited.
That means that if you write too many files into one directory, this
directory will eventually be “full”, even if there is enough free space
on the disk!
One must take into consideration that long file names (LFN) are stored
using a series of linked directory entries. An LFN uses one directory
entry for its short 8.3 name, and a hidden secondary directory entry for
every 13 characters in its long name (including dot and extension). So
a file name that is 120 characters long, for example, would use 11
entries (1 for the short name plus 10 for the long name).
On FAT32, a file with a short name uses 1 directory entry under
Windows XP, but oddly enough 2 entries under Windows 98.
On FAT32, for example, there seems to be a maximum of 65 536 entries
(including “.” and “..”) per directory. Say we
have an mbox file that contains about 20 000 mails, and for the
sake of simple calculation let's assume that the names of all these mails
have the same length. If the whole mbox file is to be unpacked to one
directory, each mail can use at most 3 directory entries (65 536 divided
by 20 000, rounded down): one for the short name and two for the long
name. So the names of the mails must not be longer than 26 characters.
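The arithmetic of this section can be reproduced with a few lines of Python. The function names are made up for this sketch, and it assumes that every file has a long name, i.e. it ignores the special case of short 8.3 names mentioned above.

```python
import math

def entries_for_name(name_length):
    """Directory entries used by one file with a long name.

    One entry for the short 8.3 name, plus one hidden entry for
    every 13 characters (or part thereof) of the long name.
    """
    return 1 + math.ceil(name_length / 13)

def max_name_length(file_count, directory_capacity=65536, reserved=2):
    """Longest uniform name length so that all files fit into one directory.

    reserved accounts for the entries "." and "..".
    """
    per_file = (directory_capacity - reserved) // file_count
    # One of the entries per file is needed for the short name.
    return max(per_file - 1, 0) * 13
```

entries_for_name(120) gives the 11 entries mentioned above, and max_name_length(20000) gives the limit of 26 characters.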
The mbox format is a common format for storing mail messages.
There is no precise specification of it, though. An 'mbox' is a text
file containing an arbitrary number of e-mail messages. Each message
is preceded by a 'postmark', and the messages are formatted according to
RFC 2822 [5]. The file format is line-oriented.
The 'postmark' is a line that begins with the string “From ”
(note the space!), not followed by a colon. Because of the
wide range of variations in practice, nothing else on the “From ”
line should be considered.
However, this software does not regard every such “From ”
line as the beginning of a new message, because sometimes it is a normal
line in the text body of the mail (e.g. “From now on ...”).
Only if the lines immediately following the “From ” line look
like an e-mail header is this “From ” line regarded as a
delimiter between two messages. This makes the program robust; it
recognizes even syntactically incorrect headers, provided they are not
too seriously damaged.
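The idea of this check can be sketched in Python as shown below. The actual criteria used by mbx2eml are not documented here, so this is only a simplified stand-in: it assumes that a line "looks like a header" if it starts with a field name followed by a colon, and it inspects a hypothetical number of lookahead lines. Folded header lines and damaged headers are not handled.

```python
import re

# Field name (printable ASCII except colon and space) followed by a colon.
HEADER_LINE = re.compile(r"^[!-9;-~]+:")

def is_postmark(lines, i, lookahead=2):
    """True if lines[i] is a 'From ' line followed by header-like lines."""
    if not lines[i].startswith("From "):
        return False
    following = lines[i + 1:i + 1 + lookahead]
    return bool(following) and all(HEADER_LINE.match(l) for l in following)

def split_mbox(lines, lookahead=2):
    """Group the lines of an mbox into messages, dropping the postmarks."""
    messages, current = [], None
    for i, line in enumerate(lines):
        if is_postmark(lines, i, lookahead):
            current = []
            messages.append(current)
        elif current is not None:
            current.append(line)
    return messages
```

A body line such as "From now on we meet daily." stays inside its message, because the lines after it do not look like header fields.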
Foxmail
Instead of a “From ” line, the program Foxmail uses a line
consisting of ASCII characters 16,16,16,16,16,16,16,17,17,17,17,17,17,83
to denote the beginning of a new mail.
mbx2eml recognizes these mailbox files
automatically, and can process them as well.
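A minimal Python sketch of splitting such a file at the Foxmail delimiter could look like this. The function name split_foxmail is invented, and the sketch assumes that the delimiter occupies a whole line, as described above; everything else about Foxmail's file layout is left aside.

```python
# 7 x ASCII 16, 6 x ASCII 17, then ASCII 83 ("S").
FOXMAIL_MARKER = bytes([16] * 7 + [17] * 6 + [83])

def split_foxmail(data):
    """Split the bytes of a Foxmail mailbox file into messages."""
    messages, current = [], None
    for line in data.splitlines(keepends=True):
        if line.rstrip(b"\r\n") == FOXMAIL_MARKER:
            # The delimiter line starts a new mail and is not copied.
            current = []
            messages.append(current)
        elif current is not None:
            current.append(line)
    return [b"".join(message) for message in messages]
```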
Eudora [6]
The Eudora mailbox format is nearly mbox format, but contrary to popular
belief it is not identical to it. It is not supported by
mbx2eml. Unfortunately Eudora uses the file
extension MBX, too.
The Date header field is often omitted from Eudora messages, presumably
because the date is contained in the initial “From ” line. This
does not conform to RFC 2822 [5]. Also in contrast
to the mbox format, Eudora extracts all attachments and saves them as
separate files.
File systems
[1] http://en.wikipedia.org/wiki/Comparison_of_file_systems
Mbox format
[2] http://www.faqs.org/rfcs/rfc4155.html
[3] http://www.qmail.org/man/man5/mbox.html
[4] http://en.wikipedia.org/wiki/Mbox
Internet message format
[5] http://www.faqs.org/rfcs/rfc2822.html
Eudora mailbox format
[6] http://eudora2unix.sourceforge.net/details.html
The program was written in
Euphoria, and translated using the Euphoria To C Translator 3.0.2.
Thanks to RDS for this good, free and open-source general purpose
programming language, and for outstanding support.
The program uses the Euphoria programming library ARWEN 0.93c. Thanks to
Michael <vulcan {AT} win.co {DOT} nz>.
The generated C code was compiled with the
Borland C++ 5.5.1 Command-line Compiler. Thanks to Borland Software
Corporation for having provided this powerful tool free of charge.
For suggestions and bug reports I want to thank Erik Kerger, Ton Kerkers,
Mattias Nyholm, Mark Finney, Marc Schneider, and Dominik Runggaldier.
If you do not accept the following license, then you are
not allowed to use or distribute this software.
1. Copyright
mbx2eml is copyright 2003-2007 by the author
Jürgen Lüthje, all rights are reserved.
2. Right to use
mbx2eml is freeware. You may use the program
free of charge and unlimited in time.
3. Copying
You may copy and distribute the software and its documentation, as long
as the file mbx2e068_en.zip is not modified.
This means, among other things, that you are not allowed to rename the
file, or split it into pieces.
Without clear written permission from the author, you are not allowed
to distribute the program as part of another archive or file.
You are not allowed to sell the program, or to enclose it with a
commercial program or a commercial collection of programs. The program
may be distributed as part of freeware/shareware collections, e.g. on
accompanying DVDs of computer magazines, though.
4. Support
You are not entitled to support by the author. However, the author tries
to answer inquiries by e-mail.
5. Disclaimer
This software is distributed WITHOUT ANY WARRANTY; without even the
implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
The author does not accept responsibility or liability for any effects,
adverse or otherwise, that this code may have on you or your computer.
Use it at your own risk.
Last updated 3. October 2012 – Contact
I am not responsible for the contents of external websites.