Decoding Internet Attachments

A Tutorial by Michael Santovec



Why Are Attachments Encoded?

Internet e-mail and Usenet news posts were designed for plain text messages. As such, many systems expect the messages to only contain printable characters from the 7-bit (first bit of the 8-bit byte is always zero) ASCII character set. These programs can have problems if the message includes extended 8-bit (the first bit is a one) characters, such as the various accented letters. This also poses a problem for sending files, such as images, sound, video, spreadsheets and programs which can contain any combination of 8-bit binary data. This even poses a problem for formatted documents, since many word processors embed binary control fields in the files.
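
The 7-bit restriction can be checked mechanically. The following is a minimal Python sketch, not part of any mail program; the function name is_7bit_clean is hypothetical and chosen only for illustration. It reports whether a block of data contains only bytes whose first bit is zero.

```python
def is_7bit_clean(data: bytes) -> bool:
    """Return True if every byte has its high (eighth) bit set to zero."""
    return all(byte < 0x80 for byte in data)


plain_text = "Hello, world".encode("ascii")
accented = "café".encode("iso-8859-1")   # the é becomes the single byte x'E9'
gif_start = bytes([0x47, 0x49, 0x46, 0x38, 0x39, 0x61, 0x25, 0x00, 0xB3])

print(is_7bit_clean(plain_text))   # True
print(is_7bit_clean(accented))     # False
print(is_7bit_clean(gif_start))    # False
```

A stricter check would also reject non-printable control characters, which are 7-bit but can still be altered in transit.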

The way around this limitation is to encode the binary data (the attachment) into ASCII characters before sending. To the mail and news systems that the message travels through, the file is just so much text. At the receiving end, the message is decoded back into the original file, none the worse for the experience. Many mail and news programs automate the encoding and decoding. However, sometimes a separate program may be required.

The nice thing about Standards is there are so many to choose from. Encoding is no exception. Among the more popular are: Uuencode, MIME, Base64, Quoted-Printable, Binhex and yEnc. There are other less common methods as well.

It should be noted that encoding is not the same as encryption. The purpose of encoding is to allow some information to be stored in, or pass through, a medium that can't handle the data directly. The purpose of encryption is to prevent unauthorized persons from viewing or using some information. It's possible for a message to use both encoding and encryption.



Uuencode

Uuencode (Unix-to-Unix Encode), as its name implies, comes from the Unix world. It was commonly used to encode files transmitted from one Unix computer to another. Since the early Internet consisted almost entirely of computers running the Unix operating system, it's not surprising that Uuencode is widely used. Today, almost all computer platforms have programs capable of encoding/decoding using Uuencode.

Most mail and news programs can decode Uuencode. However, not all of them can encode it. Most mail systems will pass Uuencode without problems. If you don't know your recipient's capabilities, Uuencode is a good first guess. Uuencode is more common in news than mail. MIME is making inroads in mail faster than news.

Uuencode results in a transmitted message about 42% larger than the original file. This is typical of the encoding penalty.

Below is how a small sample image, a.gif, would look if Uuencoded. The first line starts with the word "begin". The "644" represents the Unix file permissions (read/write/execute). This is largely ignored by other operating systems. In this example, "a.gif" is the file name.

The encoded file follows. Most lines begin with an "M" (representing the line length) and 60 characters of data. The last data line is usually shorter, and therefore starts with a different character. The end of the encoding has "`" on a line by itself and then the word "end" on a line by itself.

begin 644 a.gif
M1TE&.#EA)0`H`+,``.P`2QBQ`/__________________________________
M_____________________RP`````)0`H```$5S#(2:N]..O-N_]@*(YD:9YH
MB@(LH%ZM^U;Q3,6L+>6MW@>_5VXW%,J`Q500>5PUF:HE\4F20ITX'#:K-5FG
;WB3MZR&#J^(QM9RVF'7PN'Q.K]OO^+Q^!``[
`
end

Problems can occur due to inconsistent encoding/decoding in different mail and news programs. For example, Microsoft Outlook Express will use a blank (x'20') as an encoding character. (Some other encoders will use the ` character (x'60') instead of a blank.) If the blank ends up as the last character in a line, Outlook Express will then drop the blank, resulting in a short line. If Netscape decodes this attachment, it will assume that the short line is padded with nulls (x'00') rather than blanks. This can result in what was originally a x'40', x'80' or x'C0' byte becoming a x'00'. This problem only occurs when a x'40', x'80' or x'C0' byte was originally at the 45th byte of the file, or a multiple thereof (e.g. 90th, 135th, etc.).

The file corruption may or may not be apparent. For an image file, a chunk of the image may appear to be off color or otherwise distorted. For an executable file, it may seem to run OK, give some error when used, or give incorrect results. A ZIP file should indicate that it is corrupted when unzipped.

This problem can be avoided if the Outlook Express user uses MIME (Base64) encoding instead of Uuencode. Netscape users can successfully decode the attachment by using manual decoding with a product such as Wincode or StuffIt Expander, both of which correctly assume that short lines are padded with blanks.
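
To make the line format and the padding issue concrete, here is a minimal Python sketch of decoding one Uuencoded data line. It is an illustration under stated assumptions, not the code used by any of the programs named above: the function name uudecode_line is hypothetical, and the sketch assumes that a short line should be padded with blanks.

```python
def uudecode_line(line: str) -> bytes:
    """Decode a single Uuencoded data line into the bytes it represents."""

    def value(ch: str) -> int:
        # Each character carries 6 bits: subtract x'20' and keep the low 6 bits.
        # This maps both the blank (x'20') and the ` character (x'60') to zero.
        return (ord(ch) - 0x20) & 0x3F

    byte_count = value(line[0])              # "M" means 45 bytes on this line
    needed = (byte_count + 2) // 3 * 4       # encoded characters required
    # Assumption: a short line lost trailing blanks in transit, so pad with blanks.
    body = line[1:1 + needed].ljust(needed, " ")

    out = bytearray()
    for i in range(0, needed, 4):
        a, b, c, d = (value(ch) for ch in body[i:i + 4])
        out.append((a << 2 | b >> 4) & 0xFF)
        out.append((b << 4 | c >> 2) & 0xFF)
        out.append((c << 6 | d) & 0xFF)
    return bytes(out[:byte_count])


print(uudecode_line("&1TE&.#EA"))   # b'GIF89a'
print(uudecode_line("#1TE"))        # b'GI@' (the stripped trailing blank is restored)
```

The first call decodes the opening six bytes of the sample file. The second shows a 3-byte line whose last encoded character was a blank that got dropped; padding with a blank recovers the x'40' byte.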



MIME

MIME (Multipurpose Internet Mail Extensions) is not actually a method for encoding attachments. Rather, it deals with the overall structure of a message. A message using MIME doesn't necessarily include attachments. If it does include attachments, they most often use Base64 encoding, or sometimes Quoted-Printable encoding. Theoretically, MIME could even use Uuencode, Binhex, or other methods, but that is both rare and frowned upon in the MIME standards (RFC 2045).

The main advantage of MIME is that it provides a consistent way for the sending program to describe the message contents to the receiving program. The original Internet mail message specification (RFC 822) just describes simple text messages. The message might include an encoded attachment, but it's up to the receiving program to find it in the midst of the message text.

Most newer mail and news programs support MIME. However, older programs don't. And some older mail and news servers either remove or mutilate some of the MIME headers, rendering the message unintelligible to a receiving MIME capable program. Due to its flexibility and power, MIME is the best choice if all parties can handle it.

Some mail and news programs present a choice between Uuencode and MIME encoding. This is a bit misleading and confusing. The Uuencode choice usually means to use a simple mail message (none of the MIME message headers), and to Uuencode any attachments. The MIME choice means to use the MIME headers, and use Base64 or Quoted-Printable for attachments.

The distinguishing characteristic of a MIME message is the presence of the MIME headers. These are normally invisible in a MIME capable reader, but can be seen in the message source. Below are shown some typical MIME headers. The "MIME-Version:" header is present in all messages using MIME. The others are specific to the attachments or other contents. A MIME message may have multiple attachments of various types.

MIME-Version: 1.0
Content-Description: "Base64 encode of a.gif by Wincode 2.7.3"
Content-Type: image/gif; name="a.gif"
Content-Transfer-Encoding: Base64
Content-Disposition: attachment; filename="a.gif"

Although the MIME name specifies "Internet Mail", the same considerations also apply to news. And some parts of MIME are also used by Web Browsers. In particular, the web servers use the Content-Type ("image/gif" in the above example) to identify the type of file being sent to the browser so that the browser can determine how to handle it. However, since the browser protocol (http) supports binary transfers, the encoding issues don't apply there. Microsoft programs largely ignore the Content-Type and give priority to the file extension (the ".gif" in the above example). See the Notes on Mail and News Programs section for how Outlook Express deals with attachments which arrive without a file extension. Some other programs, such as Netscape, give priority to the Content-Type. (For more information on Content-Type, see: RFC 2046 and MIME Types)
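
As an illustration of how a receiving program can read these headers, the following Python sketch parses them with the standard library's email package. The message text is assembled from the sample headers above plus a shortened body; the variable names are chosen only for this example.

```python
from email import message_from_string

# The sample headers, followed by a shortened Base64 body.
raw_message = (
    "MIME-Version: 1.0\n"
    'Content-Description: "Base64 encode of a.gif by Wincode 2.7.3"\n'
    'Content-Type: image/gif; name="a.gif"\n'
    "Content-Transfer-Encoding: Base64\n"
    'Content-Disposition: attachment; filename="a.gif"\n'
    "\n"
    "R0lGODlh\n"
)

msg = message_from_string(raw_message)
content_type = msg.get_content_type()                 # taken from Content-Type
file_name = msg.get_filename()                        # taken from Content-Disposition
transfer_encoding = msg["Content-Transfer-Encoding"]  # header value as written

print(content_type, file_name, transfer_encoding)
```

A program that gives priority to the Content-Type would act on content_type; one that gives priority to the file extension would look at the end of file_name instead.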



Base64

Base64 is the preferred encoding method for attachments in messages using MIME. However, in some cases Quoted-Printable is used instead. Although Base64 could be used without MIME, this is rare.

Base64 results in a transmitted message about 37% larger than the original file. This is typical of the encoding penalty, but slightly more efficient than Uuencode.
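
The size penalty can be estimated per line. The sketch below is a rough calculation under stated assumptions: Base64 lines of 76 characters carrying 57 bytes each, Uuencode lines of 61 characters (the length character plus 60 data characters) carrying 45 bytes each, and a 2-character line ending (carriage return and line feed). The function name overhead_percent is hypothetical.

```python
def overhead_percent(raw_bytes: int, encoded_chars: int, eol_chars: int = 2) -> float:
    """Percentage growth of one encoded line, including its line ending."""
    return 100.0 * (encoded_chars + eol_chars - raw_bytes) / raw_bytes


base64_overhead = overhead_percent(raw_bytes=57, encoded_chars=76)
uuencode_overhead = overhead_percent(raw_bytes=45, encoded_chars=61)

print(round(base64_overhead, 1))     # 36.8
print(round(uuencode_overhead, 1))   # 40.0
```

Message headers, and the "begin" and "end" lines in the case of Uuencode, add a little more, so real messages come out slightly above these per-line estimates.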

Below is how the sample image, a.gif, would look if using Base64 encoding. The MIME headers provide all the descriptive information. This includes the file name, its type, and that Base64 encoding is used.

MIME-Version: 1.0
Content-Description: "Base64 encode of a.gif by Wincode 2.7.3"
Content-Type: image/gif; name="a.gif"
Content-Transfer-Encoding: Base64
Content-Disposition: attachment; filename="a.gif"

R0lGODlhJQAoALMAAOwASxixAP//////////////////////////////////////////////////
/////ywAAAAAJQAoAAAEVzDISau9OOvNu/9gKI5kaZ5oigIsoF6t+1bxTMWsLeWt3ge/V243FMqA
xVQQeVw1maol8UmSQp04HDarNVmn3iTt6yGDq+IxtZy2mHXwuHxOr9vv+Lx+BAA7
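
As a quick check that the text above really is the image, the following Python sketch decodes just the first 16 characters of the sample with the standard base64 module and reads the start of the GIF header. The interpretation of bytes 6 to 9 as little-endian width and height follows the GIF file format; the variable names are only for illustration.

```python
import base64
import struct

# First 16 characters of the Base64 sample above (12 bytes of the GIF file).
encoded_prefix = "R0lGODlhJQAoALMA"
data = base64.b64decode(encoded_prefix)

signature = data[:6]                               # GIF signature and version
width, height = struct.unpack("<HH", data[6:10])   # little-endian 16-bit values

print(signature, width, height)   # b'GIF89a' 37 40
```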


Quoted-Printable

Quoted-Printable is used to encode some attachments in messages using MIME. Quoted-Printable leaves printable ASCII characters alone and only encodes those characters (bytes) that might get lost or converted in transit.

If the attachment consists mostly of printable ASCII characters, the MIME program may automatically select quoted-printable over Base64, since this would be much more efficient. In the best case, Quoted-Printable results in a transmitted message only about 3% larger than the original file. However, in the worst case, the transmitted message could be about 200% larger than the original file. So it's important to only use this encoding method on suitable files.

Although Quoted-Printable may be used for attachments, it is more often used for the main message text. The mail or news program may offer the option to encode the text using Quoted-Printable. There are two advantages to this. 1) Characters outside of the normal printable ASCII can be safely transmitted. This includes some special characters, and letters from some foreign alphabets. 2) The intended paragraph layout can be preserved. Simple text messages are arbitrarily chopped into suitable chunks (typically 70-80 characters per line) by the sending program. Quoted-Printable allows the logical lines to exceed the physical line limits of the mail or news transport. It places a hard carriage return (line break) only at the end of a paragraph. The receiving program can then reflow the paragraph to the viewing window. Not all receiving programs support wrapping. This may result in each paragraph being displayed as a single line making the message difficult to read. And some programs will wrap the text for display, but not for printing.

Some mail and news servers may automatically convert any messages that contain 8-bit characters into quoted-printable encoding as the message passes through them.

The following sample text

This is a example of a quoted-printable text file. This might contain some special characters such as:
equal sign =, dollar sign $, or even extended characters such as the cent sign ¢ or foreign characters ÀÆß


is shown below as it would look if using Quoted-Printable encoding. An equal sign "=" at the end of a line indicates a soft carriage-return. The receiving program should remove it and flow the following line into this one. An "=20" at the end of a line represents a space. Normally, trailing spaces on a line are removed in transit; encoding them as "=20" causes them to be preserved. And finally, an equal sign followed by 2 hexadecimal characters (0-9, A-F) represents an extended character.

MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

This is a example of a quoted-printable text file.  This might contain =
some special characters such as:=20
equal sign =3D, dollar sign $, or even extended characters such as the =
cent sign =A2 or foreign characters =C0=C6=DF

If the recipient's mail or news program can't handle quoted-printable (many older ones can't), the message will look peculiar with all the equal signs and hexadecimal encoding, but it is still largely readable.
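
The decoding rules can be tried out with Python's standard quopri module. This is a minimal sketch using the last three lines of the encoded sample; the character set ISO-8859-1 is taken from the sample's Content-Type header.

```python
import quopri

encoded = (
    b"some special characters such as:=20\n"
    b"equal sign =3D, dollar sign $, or even extended characters such as the =\n"
    b"cent sign =A2 or foreign characters =C0=C6=DF\n"
)

raw = quopri.decodestring(encoded)   # undo the =XX codes and soft line breaks
text = raw.decode("iso-8859-1")      # interpret the 8-bit bytes as characters
print(text)
```

The soft line break after "such as the" disappears, the "=20" becomes a real trailing space, and "=3D" turns back into an equal sign.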



Binhex

Binhex is most often used with the Macintosh. Although Binhex decoders are available for other platforms, few people have them. Binhex is a reasonable choice for encoding if both the sender and recipient are using Macs. However, in any other case another encoding method should be used.

Unlike most other methods, Binhex encodes the file name and other information along with the data.

Also, unlike most other methods, Binhex has a built-in compression capability. It's possible that a highly compressible file could result in a smaller transmitted message than the original file. However, you will generally get better results by compressing the file first with a standard compression utility. For an already compressed file, Binhex results in a transmitted message about 40% larger than the original file. This is typical of the encoding penalty, but slightly less efficient than Base64.

Below is how the sample image, a.gif, would look if using Binhex encoding. The first line, rather obviously, indicates the encoding method.

(This file must be converted with BinHex 4.0)
:"@%ZCfPQ!&4&@&4YC'pc!*!&SJ#3"*!!bNG*4MJjB58!+!#c!!$X!%XBX3$rN#S
X!*!%*3!S!!!%9c$)5DZp11[0ZrpJ+)jNDCjSLJ)XS&kYqeEa6-@X,H@YhJHr9fi
h&-U!a933H9`eQDSPm8Q53Tdi($DV09QRhL6Ykb'$Uq)aYCbfQ(A`Z(a1Vp[[q,a
q"!!lU1B!!!!!:

A file encoded using Binhex often has an HQX file extension. If Binhex is used in a MIME formatted message, it usually has a Content-type: application/mac-binhex40. This is a departure from the usual MIME format, in that the Content-type indicates the encoding method rather than the type of data in the file. For more information, see RFC 1741, as well as MacDisk.



yEnc

yEnc is a newer (2001) encoding format. It is primarily targeted to large binaries in newsgroups, although it could be used in e-mail. Some news readers support yEnc, but not Microsoft or Netscape.

The main reason for yEnc is its low encoding overhead of about 2%. It does this by using almost all 8-bit characters. However, this can cause some problems. Some servers can't handle the 8-bit characters, resulting in corrupted messages. Some servers may automatically convert any messages using 8-bit characters to either Base64 or Quoted-Printable. This would negate any size advantage of yEnc as well as complicating the decoding for recipients. This is most likely to happen in a MIME formatted message.
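
The low overhead comes from the way yEnc treats each byte. The following Python sketch shows the core byte transformation as described in the published yEnc specification: add 42 to each byte, and escape only the few results that would be unsafe (null, line feed, carriage return and the equal sign) by writing "=" followed by the value plus 64. It is a simplified illustration only. It omits the =ybegin and =yend lines, line wrapping and the CRC check, and the function names are hypothetical.

```python
ESCAPE = 0x3D                           # the "=" character
CRITICAL = (0x00, 0x0A, 0x0D, ESCAPE)   # null, LF, CR and "=" itself


def yenc_encode(data: bytes) -> bytes:
    """Apply the yEnc byte transformation (no header, trailer or line breaks)."""
    out = bytearray()
    for byte in data:
        value = (byte + 42) % 256
        if value in CRITICAL:
            out.append(ESCAPE)
            value = (value + 64) % 256
        out.append(value)
    return bytes(out)


def yenc_decode(encoded: bytes) -> bytes:
    """Reverse yenc_encode."""
    out = bytearray()
    stream = iter(encoded)
    for value in stream:
        if value == ESCAPE:
            value = (next(stream) - 64) % 256
        out.append((value - 42) % 256)
    return bytes(out)


every_byte = bytes(range(256))
encoded = yenc_encode(every_byte)
print(len(every_byte), len(encoded))   # 256 260
```

Only 4 of the 256 possible byte values need the two-character escape, which is why the size penalty stays so small, and also why the result is full of 8-bit characters.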

Below is how the sample image, a.gif, might look if using yEnc encoding. The first line starts with "=ybegin". The following lines are the encoded attachment. The line length is typically 128 or 256 as specified on the =ybegin line. The last line starts with "=yend".

[Image: sample of 8-bit yEnc-encoded data]

For more information on yEnc see yEnc.org, yEnc Tools (encoders/decoders and news readers), yProxy (automatic decoding for news readers), yDecode (automatic decoding for news readers with support for automatic combining of multipart posts and saving attachments), Fidolook (Outlook Express add-on), OeyEnc for Outlook Express and Vista Windows Mail, yEnc considered harmful, and Why yEnc is bad for Usenet.



Other Encoding Methods

In addition to the most common encoding methods discussed above, you might encounter several other methods of encoding attachments, or several things that look similar to encoded attachments. The following should help you identify what you've got.

Binary, 8-Bit, Raw

Below is how the sample image, a.gif, might look if using Binary, 8-bit, or Raw encoding. This does not actually encode the file, but rather includes the data without any conversion. Some mail and news programs allow you to do this, or you might be able to paste the file or concatenate it into the message text without conversion. However, Internet mail and news transports don't guarantee to transmit the file without alteration. It's quite possible that the message will get truncated because some combination of characters looks like an end of message indication. It might even result in a corrupted mail folder on the mail server or in the recipient's mail program. It's unlikely that the recipient will be able to use the file unless it was a text file to begin with. If you receive such a file, you might as well throw it away and ask the sender to try again.

[Image: sample of raw 8-bit binary data]

BTOA

Below is how the sample image, a.gif, might look if using BTOA encoding. This is rarely seen on the Internet today.

xbtoa5 78 a.gif Begin
7nH003FO36-igRR!:0\Y(pO)@s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s"<
"-M!!";F-ia5M="q]eX1^LYc+F!`.#qhPSnNu_/>-=Oqc<5\`N1ZQXkj;t=)Kr2b(.H1&:%M<RAqS,
'8WlE23#jiW2-Hj6,jjn@K<*ud[@F[mFmung:9WF@pq2%Y!':/\E
xbtoa End N 162 a2 E 26 S 3a3f R 7626cb65

BOO

Below is how the sample image, a.gif, might look if using BOO encoding. This is rarely seen on the Internet today.

A.GIF
AdU6>3UQ9@0X0;<00>`0BaRa0?oooooooooooooooooooooooooooooooooooooooooooooooooo
ooooob`0~39@0X~215L`b4V[_CS[cK_oH2R>I6VNJ8X2;:1N[O]FlDc5[2gU[Mh7_eM^=aC:P<ED
47UL=IVZ9O59TT:M>1`fZcEIYmhTkN\QPj_R<KFL]YQel;QlCZoKkoRlOP@0>`00

ROT-13

ROT-13 is not an encoding format for attachments. It is a simple encryption for text. It rotates each letter of the alphabet 13 positions. "A" and "N" are exchanged, "B" and "O" are exchanged, etc. Numbers, spaces and punctuation are not changed. Because it is so simple, its purpose is not security. Rather, it is used so that others don't accidentally read a message that they don't want to. It was most often used for messages of questionable taste. Some news readers have ROT-13 decoding built-in. It is rarely used on the Internet today. Below is a sample of a message using ROT-13 encoding.

Guvf vf n fnzcyr bs n zrffntr rapbqrq hfvat EBG-13 rapbqvat.
Orpnhfr bs gur fvzcyr angher bs gur rapelcgvba, vgf checbfr 
vf abg frphevgl ohg gb cerirag nppvqragny ernqvat.
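
For readers without a news reader that has ROT-13 built in, the rotation is easy to reproduce. This short Python sketch uses the "rot_13" text transform that ships with the standard codecs module to decode the first line of the sample above.

```python
import codecs

scrambled = "Guvf vf n fnzcyr bs n zrffntr rapbqrq hfvat EBG-13 rapbqvat."
readable = codecs.decode(scrambled, "rot_13")
print(readable)   # This is a sample of a message encoded using ROT-13 encoding.

# Rotating by 13 a second time restores the scrambled text, so the same
# operation both encodes and decodes.
scrambled_again = codecs.encode(readable, "rot_13")
```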



MS-TNEF WINMAIL.DAT Attachments

Mail programs in the Microsoft Exchange family, which includes Windows Messaging, Outlook97, Outlook98 and Outlook2000+, will include a TNEF (Transport Neutral Encapsulation Format) attachment named WINMAIL.DAT when the sender selects, or defaults to, RTF (Rich Text Format). If the sender is using MIME formatting, this attachment will have Content-Type: application/ms-tnef.

The TNEF attachment includes a Rich Text Format (e.g. bold, underline, fonts) version of the plain text message. If the sender has included any attachments (e.g. pictures, spreadsheets, programs), they will be embedded within the TNEF attachment and not as separate attachments.

Most other mail programs do not know how to handle the TNEF attachments and so Exchange family users should avoid using RTF unless they know that the recipient has a compatible program. The sender can control the use of RTF on a recipient-by-recipient basis. However, if sending via a Microsoft Exchange server, the server can override the sender's settings. In this case the Exchange administrator will need to make changes on the server. For more information on this, see the following articles: How Message Formats Affect Internet Mail, HOWTO: Force a Particular Internet Encoding by Using MAPI with Exchange, XCLN: Sending Messages In Rich-Text Format, XFOR: Preventing WINMAIL.DAT Sent to Internet Users, Sending RTF with Attachment as MIME Loses Attachment, XADM: POP3 Users may not Receive an Attachment if Part of DL, XFOR: CR Receives Rich Text Format Information Unexpectedly, Winmail.dat attachments are included in received e-mail messages in Outlook 2002 or 2003, E-mail attachments are not visible to some recipients (Outlook 2003/7).

Fentun

Fentun is a freeware TNEF Attachment Extractor. It is available for the Win95/98/NT4 and Linux operating systems. It does not show the RTF message embedded in the TNEF attachment, but does let you see if there are any other attachments within the TNEF attachment and lets you save them.

For Netscape Mail, Fentun can be installed as a Helper. Instructions are at the Fentun web site.

For instructions on using Fentun with the Pegasus Mail program, see: Pegasus and MS-TNEF.

If the Fentun author's web site is unavailable, a copy of the Win95/98 version, along with notes, is available for download here.

If you download this MS-TNEF.REG file, and run it on a Win95/98 system, it will create a file association for *.TNF files to Fentun, in order to make Fentun easier to use with other mail programs. Depending on where you install the FENTUN.EXE program, you will need to either edit the path in the registry file before running, or else update the path in the file association after running. For details, see the comments in the MS-TNEF.REG file.

Users of Microsoft IE3 Internet Mail, IE4+ Outlook Express, and Windows Live Mail will not see any indication that they have received a TNEF attachment. Apparently, Microsoft has decided that since these programs can't handle a TNEF attachment, it will be hidden. In order to get these programs to decode the TNEF attachment and make it available to Fentun via the above file association, perform the following steps:
  1. Save the whole e-mail message to an *.EML file via File, Save As, or Drag-and-drop
  2. Open the *.EML file in a text editor such as Wordpad or Notepad
  3. Locate the line Content-Type: application/ms-tnef;, and change the Content subtype to something else, such as Content-Type: application/ms-tnefx;
  4. Locate the line filename="winmail.dat" and change to filename="winmail.tnf". You may change the winmail part as well, if desired
  5. Save the *.EML file from the text editor
  6. Double click the *.EML file from the Windows File Explorer. This will open the message in a mail window and the TNEF attachment should now be available for saving or opening.
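
Steps 3 and 4 above amount to two text substitutions, which could be scripted instead of done by hand. The following Python sketch is one way to do that, assuming the saved message contains the header text exactly as written in the steps (real messages may differ in letter case or spacing). The function name unhide_tnef and the sample text are hypothetical.

```python
def unhide_tnef(eml_text: str) -> str:
    """Apply the Content-Type and file name changes from steps 3 and 4."""
    text = eml_text.replace(
        "Content-Type: application/ms-tnef;",
        "Content-Type: application/ms-tnefx;",
    )
    return text.replace('filename="winmail.dat"', 'filename="winmail.tnf"')


sample = (
    "Content-Type: application/ms-tnef;\n"
    '\tname="winmail.dat"\n'
    "Content-Disposition: attachment;\n"
    '\tfilename="winmail.dat"\n'
)
print(unhide_tnef(sample))
```

To use it on a saved message, read the *.EML file as text, pass the contents through the function, and write the result back before opening the file as in step 6.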

LS-TNEF

LS-TNEF is a Java based TNEF Decoder. The LS-TNEF API allows one to decode the TNEF file from the command line, via the API or by using Sun's Java Mail API. A version of LS-TNEF for the Mac OS is available at iTools on the public iDisk of user "jlg" (a free iTools account is required).

TNEF's Enough

Mac Development TNEF's Enough is a Macintosh specific TNEF decoder.

WMDecode

WMDecode is a Windows specific TNEF decoder. The site also has WinMail to Extract attachments from winmail.dat files received on the EPOC32/Psion/Symbian Operating System.

Winmail Opener

EOLSOFT Winmail Opener is a GUI and command line utility to deal with the winmail.dat file (TNEF message). It can extract the RTF message text and any attachments. It supports Windows 95/98/ME, Windows NT, Windows 2000 and Windows XP.



HTML - Web Pages

HTML (Hyper Text Markup Language) is used to describe web pages. A web page consists of one or more files. The main file is a text file that contains the HTML formatting codes, usually most or all of the page text, links to any other web pages and links to any images, animations, sound clips, etc. Each image, animation, sound clip, etc. is a separate file.

Some, but not all, mail and news programs can display HTML messages. Most that can display HTML require that the message use MIME formatting and specify Content-Type: text/html for the HTML message portion (e.g. Microsoft and Netscape). Some ignore the Content-type and merely look for the <HTML> tag in the message body to decide whether to interpret the message as HTML (e.g. Eudora). If your message displays as the raw HTML code (see the sample below), either your program doesn't support HTML, or the Content-type was wrong. It is somewhat common for mailing lists and spam to come with a Content-Type: text/plain even though the message contains HTML code.

Many programs that create HTML messages specify Content-Type: multipart/alternative and include two copies (attachments) of the message text. The first is the plain text version of the message (Content-Type: text/plain), the second is the HTML version (Content-Type: text/html). If a receiving program understands the MIME multipart/alternative and HTML, it will display the HTML version in the message body and hide the plain text one. If it doesn't understand HTML, it will display the plain text version in the message body and hide the HTML one. If it doesn't understand multipart/alternative it may display either or both message copies as attachments. And if it doesn't understand MIME, it will display both copies in the message body, but the HTML version will be difficult to read because it will be the raw HTML with all the formatting codes displayed.

Below is the message source for "A simple HTML message.":

MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_0059_01BEA6E2.1A467F40"

This is a multi-part message in MIME format.

------=_NextPart_000_0059_01BEA6E2.1A467F40
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

A simple HTML message.

------=_NextPart_000_0059_01BEA6E2.1A467F40
Content-Type: text/html;  charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD W3 HTML//EN">
<HTML>
<HEAD>
<META content=3Dtext/html;charset=3Diso-8859-1 =
http-equiv=3DContent-Type>
<META content=3D'"MSHTML 4.72.3110.7"' name=3DGENERATOR>
</HEAD>
<BODY>
<DIV>A simple <STRONG>HTML</STRONG> message.</DIV>
<DIV> </DIV></BODY></HTML>

------=_NextPart_000_0059_01BEA6E2.1A467F40--
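
The way a receiving program chooses between the two copies can be sketched with Python's standard email package. This is only an illustration of the multipart/alternative idea, not how any particular mail program works. It builds a small message with a plain text copy and an HTML copy, then asks for the preferred body part.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("A simple HTML message.")      # the plain text copy
msg.add_alternative(                           # the HTML copy
    "<HTML><BODY><DIV>A simple <STRONG>HTML</STRONG> message.</DIV></BODY></HTML>",
    subtype="html",
)

part_types = [part.get_content_type() for part in msg.walk()]
print(part_types)

# A program that understands HTML prefers the HTML copy...
html_part = msg.get_body(preferencelist=("html", "plain"))
# ...while a text-only program asks for the plain text copy.
plain_part = msg.get_body(preferencelist=("plain",))
print(html_part.get_content_type(), plain_part.get_content_type())
```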

The HTML capabilities of mail and news programs come in 3 levels:
  1. HTML Text Only Some programs can only display HTML text and links. They cannot display any referenced images, play sound clips, etc. Links to images and such might just appear as a link, or not at all.
  2. HTML Text and External Images Some programs can display images if the HTML contains a link to an external source, such as a web site or a corporate file server. If the image link is to a web site, then the recipient must have an open Internet connection while reading the message for the image to be displayed. If the image link is to a corporate file server, then the recipient must have access to the file server while reading the message, and use the same drive mappings as the sender. If the image link is to the sender's local disk drive, then the recipient would have to have a copy of the image file already on their own local disk drive in the corresponding directory. Support for links to sound clips may be limited or non-existent.
  3. HTML Text, External and Internal Images Some newer programs support Content-Type: multipart/related which also allows the linked images to be attachments in the same message as the HTML code. The advantage is that the recipient doesn't need a live Internet connection or access to a file server to see the images while reading the message. One disadvantage is that the mail or news message becomes much larger in size. Additionally, if such a message is sent to a recipient with only level 2 HTML support, they won't see the images within the message. They may see the images as separate attachments, or not at all.

If a message's HTML features exceed the capabilities of the recipient's mail or news program, it is possible to save the HTML portion of the message to an HTML file (.HTM or .HTML file extension) and then open it in the web browser. If the message used multipart/related, the images will also need to be saved, possibly requiring manual decoding. The HTML file will require editing if multipart/related was used because the links are not normal file names. Even in other cases, some editing may be required to make the HTML file usable. Overall, this is a lot more bother than it is probably worth.



Macintosh Notes - AppleSingle and AppleDouble

Some Macintosh files consist of two parts called forks: the Data Fork and the Resource Fork. A Macintosh mail or news program could encode attachments in one or more of the following ways: AppleSingle, AppleDouble, or Data Fork only. Some may offer a choice, some will support only one of these. AppleSingle and AppleDouble attachments normally use MIME formatted messages with Base64 encoding. The Data Fork only attachments could use Base64, Uuencode or Binhex. For example, if the Macintosh program gives you encoding options of AppleDouble, Base64 and Binhex, the Base64 and Binhex options likely send only the Data Fork.

When a non-Macintosh user receives an AppleDouble attachment, they will most likely see two attachments. Both attachments might have the same name (e.g. photo.jpg), or the first attachment (Resource Fork) might have a generated name (e.g. att0001.dat) while the second (Data Fork) has the real name (e.g. photo.jpg). The first attachment is usually small (less than 10 KB). They will likely get an error if they try to open the first attachment (unknown file type or invalid file format). They should ignore the first attachment and just open/save the second one.

You can verify the presence of AppleSingle or AppleDouble encoding by looking at the message source. AppleSingle will have a single attachment with a Content-type: application/applefile. AppleDouble will have a header with a Content-type: multipart/appledouble followed by the Resource Fork attachment with Content-type: application/applefile followed by the Data fork attachment with a Content-type that depends on the actual file type (e.g. Content-type: image/jpeg). For more information, see RFC 1740.
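
The same check can be done in code. The Python sketch below parses a hand-written message that follows the AppleDouble layout just described and picks out the Data Fork part, which is the one a non-Macintosh user wants. The message text, the boundary string "sep" and the function name find_data_fork are all made up for this illustration.

```python
from email import message_from_string

raw_message = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/appledouble; boundary="sep"\n'
    "\n"
    "--sep\n"
    'Content-Type: application/applefile; name="photo.jpg"\n'
    "\n"
    "(resource fork data would be here)\n"
    "--sep\n"
    'Content-Type: image/jpeg; name="photo.jpg"\n'
    "\n"
    "(data fork data would be here)\n"
    "--sep--\n"
)


def find_data_fork(message):
    """Return the Data Fork part of an AppleDouble message, or None."""
    if message.get_content_type() != "multipart/appledouble":
        return None
    for part in message.get_payload():
        # The Resource Fork is labeled application/applefile; skip it.
        if part.get_content_type() != "application/applefile":
            return part
    return None


msg = message_from_string(raw_message)
data_part = find_data_fork(msg)
print(data_part.get_content_type(), data_part.get_filename())
```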

A MacBinary file (.BIN) is similar to AppleSingle in that it combines the data and resource forks in a single file. This isn't a normal attachment encoding but rather is used when transferring Macintosh files via another operating system, as in storing on a file server or web site. There are utilities for extracting the data fork, including StuffIt Expander (see: Decoding Mechanics).

See the MacDisk Hints and tips on conversion for help on dealing with application files transferred between a PC and a Macintosh.



Identifying Attachment File Types and Setting File Associations

Usually the file name of an attachment indicates the type of file it is. For example, a file named A.GIF has a file extension of GIF which indicates that it is probably a GIF image file. Knowing the type of file allows you to select an appropriate application program to open the file in.

Windows 95 and above may be configured to hide file extensions. This can make it difficult to identify the file type. This may also make it impossible to change (rename) the file extension of a file in order to fix it. To have Windows show the file extensions:
  1. Double-click My Computer
  2. Click on the View menu (Win95/98/NT4/Me) or Tools (ALT+T) menu (Win2000/XP/Vista/Win7)
  3. Click on Options (or Folder Options)
  4. In the pop-up window, click the View tab
  5. Then click the "Hide file extensions for known file types" (or "Hide MS-DOS file extensions for file types that are registered") check box to clear it.
  6. Click OK, and then close My Computer

The following sites have lists of file extensions and the types of file each could be: Joz's Extensions Base, CKNOW.COM FILExt, Common Internet File Formats, WhatIs.com, Windows Media Player Multimedia File Formats, File Extensions Windows/OS2/Apple/UNIX, FileInfo.com, Dot What?!, The Sharpened.net Help Center, Signatures of Macintosh Files, File Extension Seeker.

If the file extension is .ZLx (where x is 0-9 or A-Z) or .Z0 or .Z1, or contains DEFANGED-, see the Viruses section.

However, file extensions are not necessarily unique. The same file extension may be used by different types of files. For example, a DAT, ATT or TMP file could be just about anything. In some cases, the attachment arrives with an incorrect file extension. It may be necessary to take a look at the contents of a file to determine what it is. Often you can use a text or word processor, such as the Windows Notepad or Wordpad, to look at a file.

See the Notes on Mail and News Programs section for how Outlook Express, Windows Mail and Windows Live Mail deal with attachments which arrive without a file extension.

If the file is mostly a jumble of letters and numbers, it may need manual decoding. Comparing the file to the examples in Uuencode, MIME, Base64, Quoted-Printable, Binhex, yEnc and Other sections should allow you to identify the type of encoding used. See the Decoding Mechanics section for how to handle these files.

If the file appears to be mostly strange characters, there may be a few letters in the mix that let you identify the type of file it is. For example, if the file appears in your word processor as
A 8-Bit
the GIF as the first 3 characters identifies this as a GIF image file. The identifying characters don't necessarily match the file extension and they aren't necessarily the first 3 characters. This varies by file type.
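
The idea of looking for identifying characters can be sketched in a few lines of Python. This is only an illustration, not a complete identifier: the signature table and the function names are my own assumptions, it covers just a handful of well-known formats, and it only checks the start of the file, even though some file types keep their identifying characters elsewhere.

```python
# Illustrative table of well-known leading byte sequences ("signatures").
# The table and helper names are hypothetical; the list is far from exhaustive.
SIGNATURES = [
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"BM", "BMP image"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"%PDF", "PDF document"),
    (b"Rar!", "RAR archive"),
]


def guess_file_type(data):
    """Return a description if the data starts with a known signature."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


def guess_file_type_from_path(path):
    """Read only the first few bytes of a file and guess its type."""
    with open(path, "rb") as handle:
        return guess_file_type(handle.read(16))
```

A result of "unknown" only means the file did not match this short table; it may still be a perfectly good file of some other type.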

The following is a list of some common file types and the identifying information. IrfanView is a free image viewer and converter that can handle most popular image types including BMP, GIF, JPG, PCX, TIFF, etc.

If the file is a Microsoft Office file (Word, Excel, Access, PowerPoint), the following free utilities may be useful: Microsoft Office Converters and Viewers.

Setting File Associations

Once you've determined the correct program to use with a particular file type, you will want to set the file association. Windows uses the File Associations to select the default program to use to open a particular file type. This is used when you double-click a file to open it from Windows Explorer or My Computer. It's also used by some browsers, such as Internet Explorer, and some mail and news programs, such as Outlook Express and Windows Live Mail. Other mail and news programs, such as Netscape, have their own configuration settings for this.

Many applications, when installed or reinstalled, will set the file associations to point to that program for the file types that they can handle. Some will do this without asking. Some will ask and may let you choose which file types the program should handle.

The setting/changing of the Windows File Associations varies somewhat by the version of Windows. For the specifics see: Win95/98, WinMe, WinNT4, Win2000 or Win2000, WinXP, Vista/Win7

The following procedure will work with the various versions of Windows to set a basic file association (the exact wording may vary by Windows version):
  1. Locate a file of the type you want to set the file association for in My Computer, the Desktop or Windows Explorer
  2. Right click the file and select Open With
  3. Select Choose Program
  4. If you don't see the desired program in the list presented, select Browse or Other to search for it
  5. Once you have selected the desired program, be sure to check the box Always use the selected program to open this kind of file
  6. Click OK
If you need to change an option in the file association, such as Confirm Open After Download, MIME Content-Type, or the Application Used needs a parameter (for example opening EML or EMAIL files in Outlook Express requires that the application be: "C:\PROGRAM FILES\OUTLOOK EXPRESS\MSIMN.EXE" /eml:%1 ) you'll need to directly access the file associations for making the change. Many applications require the "%1" parameter to properly open files with long file names as often happens when opening attachments from the Temporary Internet Files. As an example see Cannot Find the File. Although that article specifically mentions Wordpad on Win95, it applies to many applications on Win95 and newer.
  1. Double-click My Computer
  2. Click on the View menu (Win95/98/NT4/Me) or Tools menu (Win2000/XP)
  3. Click on Options (or Folder Options)
  4. In the pop-up window, click the File Types tab
  5. Scroll down to the desired File Type
  6. Click on EDIT (Win95/98/NT4) or CHANGE or ADVANCED (WinME/2000/XP). If you see RESTORE instead of ADVANCED, that means that the association has been changed from the Windows default program. Clicking RESTORE will reset the association. That may solve your problem. If not, you'll then have the ADVANCED option and can make any specific changes needed.
  7. Changes to Confirm Open After Download or MIME Content-Type can be made here. If you need to change the Application Used, Select the OPEN action, then click EDIT
In the case of Windows XP, the MIME Content-Type setting is not accessible via the File Types. You need to edit the Registry. For a REG file that fixes the Content-Type of common image file types, see: Some types of images don't display

Windows Vista and Windows 7 do not have the above access to the File Types dialog. There is a new facility in these versions of Windows that may solve your problems. Go to Control Panel, select Programs, then Set your Default Programs. Then select the desired program and Choose Defaults for this program. For more information on this, see How do I... Change file extension associations in Windows Vista? (the same procedure applies to Windows 7). If you need to make more changes than this allows, you will need to edit the Registry (not for the faint of heart) or use a File Association utility. For more on this see Windows Vista File Associations Advanced Editing Management Tools and FileTypesMan.

You may run across a problem where you can open a saved attachment by double-clicking it, but if you try to open it from within some mail/news programs, you get an error message that there is no file association for the file type. This is due to some security changes starting with Windows XP with SP2. Outlook Express, Windows Live Mail and the Vista Windows Mail programs require that the Windows File Association have an OPEN Action. Some viewer applications just install a default action such as SHOW or DISPLAY. Adding an OPEN Action with the same settings as the default action will allow you to open the file from within these programs rather than having to save the attachment and open that. An example of this problem is seen with some versions of the PowerPoint Viewer. For that specific case, see: Unable to open .PPS attachments directly from Outlook Express and Unable to open .PPS attachments directly from Windows Mail in Windows Vista. A similar issue can be seen with Acrobat PDF files; see Unable to open .PDF attachments from Windows Mail. The issue and its fix are not specific to a particular mail program but to the Windows File Associations. So, for example, if Outlook Express has the problem with PPS files on Windows XP, so will the Windows Live Mail program on the same PC. The Outlook Express fix will fix the problem for both programs. Likewise, the PDF fix for the Vista Windows Mail program will fix the same issue in Windows Live Mail there.

If the MIME Content-Type is missing or incorrect in the File Associations, you generally won't notice that. However, there are a few cases where it may be important for it to be correct.

If you are having trouble with file associations, the FileExtInfo utility will gather all the file association information from the Windows registry for a specified file extension and save it to a file named FileExtInfo.txt on the desktop. This can be shared with a technician to help diagnose the problem.



Viruses, Worms and Trojan Horses

Viruses, Worms and Trojan Horses are programs designed to do things to your computer that you don't want done (e.g. deleting files, stealing information, sending messages to others, etc.). Although there are technical differences amongst these, for simplicity they will all be referred to here as viruses. They are often distributed through Internet Attachments.

In order for the virus to do its damage ("infect" your computer) the instructions that make up the virus must be executed by your computer. This is done when you Open (run, execute) the file containing the virus in an application that expects the file to contain instructions, including macros. If you just save the file, or copy it, it won't cause any problems. (However, you have to be careful, since operating systems make it all too easy to accidentally open a file.) Likewise, if you open a file in an application that doesn't expect any instructions and therefore won't execute them, this is also safe. For example, Windows Notepad expects a simple ASCII text file and wouldn't execute any instructions or macros if it found them. Also, simple image files, such as GIF, JPG and BMP are data only. Image display programs, including browsers and mail programs, don't look for any instructions in these either, so they are generally safe to open. (An exception occurs if the display program has a bug whereby a carefully crafted image causes the program to crash and inadvertently execute some of the image as code. For an example of this and the fix, see: Security Update for JPEG Processing.) However, Word and other Office files can contain macros, so these are risky to open in their applications. For some additional information on viruses, including anti-virus programs, see the Internet Security section of my Technical Help web page.

To protect yourself from viruses

Complications

Measures taken to stop viruses can interfere with attempts to send and receive legitimate attachments.

Compression and Message Size Limits

Sometimes it is a good idea to compress the files before attaching. The advantages of this are:
However, since compression does require a little extra work on both the sender's and recipient's part, there are times when it isn't worth the bother. If the file size is small (e.g. less than 40 KB) or a file type that doesn't compress well (see above) and you are only sending one or two files, I wouldn't compress them. Also take into consideration whether the recipient will know how to uncompress the file.

There are several compression methods:

ZIP

This is the most common compression method on DOS and Windows PCs. Programs are also readily available to handle Zip files on other platforms. This is the best choice of a compression method for DOS and Windows, as well as cross-platform file transfers. PKWARE PKZIP/PKUNZIP is the standard DOS program for ZIP files. They also have versions for Windows, OS/2 and Unix. There are also numerous DOS and Windows front-end programs available to make the use of the DOS PKZIP easier. Winzip is a popular Windows-based Zip/Unzip program that also has decoding abilities. Info-ZIP has freeware Zip and Unzip programs for over 30 operating systems. 7-Zip is a freeware Zip and Unzip program. It also handles TAR, RAR, ARJ and some other formats.
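
If you would rather script the compression step than use one of the programs above, the following is a minimal sketch using Python's standard zipfile module. The function name and the in-memory approach are just my assumptions for illustration; any Zip program produces an equivalent archive.

```python
import io
import zipfile


def zip_bytes(files):
    """Pack a {name: bytes} mapping into one ZIP archive held in memory."""
    buffer = io.BytesIO()
    # ZIP_DEFLATED is the usual ZIP compression method (needs zlib, which
    # is normally present in a standard Python installation).
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()
```

Comparing the length of the result with the total size of the original files is a quick way to see whether compressing is worth the bother for a particular attachment.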

RAR

RAR is a shareware compression program available on a variety of platforms from RarSoft. UNRAR is a freeware uncompress program for RAR files available from the same location.

StuffIt - SIT

Aladdin Systems StuffIt is commonly used on the Macintosh. Unstuffing utilities (e.g. StuffIt Expander) are available for DOS and Windows PCs, but few people have them. Utilities for creating SIT files are not commonly available on these platforms. StuffIt Expander can also uncompress ZIP, ARC and ARJ files as well as decode Uuencode and Binhex and extract MacBinary files.

TAR - Compress

TAR (Tape ARchive) is the standard archive for Unix systems. This is often combined with the standard Unix utility "compress". Compressed Unix files typically have a file name ending in ".Z". TAR and Unix "compress" compatible utilities are available for other platforms, but few people have them.

Other

Other compression methods include ARC and ARJ. These are not commonly used today. Although there are programs available to specifically uncompress these types of files, often utilities for dealing with other compression methods will also handle these file types.

For ZIP and other utilities, try these sites: ZDNET Hotfiles, SHAREWARE.COM, c|net Download.Com, Filez.

File Transfer/Sharing Web Sites

As an alternative to e-mailing large files, there are a number of web sites that let you upload your file to them and then send an e-mail link for the file to your recipient, who can then download the file when they desire. This lets you get around the message and mailbox size limits of the sender's and recipient's ISPs, may result in faster upload and download since encoding is not required, and may be more convenient for the recipient. Such web sites include SkyDrive, dropload.com and yousendit.com.



Encoding Mechanics

Most newer mail and news programs provide automatic encoding of attachments. You merely select the menu item or button for Attachment and then select the file to attach. Or your operating system may provide a drag-and-drop or other method to send a file. If your mail or news program supports more than one encoding method, there will be an option to set the default encoding method. There may also be the option to override the default encoding method when composing a message. (Note: Some programs support more decoding methods than they do encoding methods. For example, a program might always use MIME encoding in sending, but be able to decode either MIME or Uuencoded attachments on receipt.)

There are two cases where you might need or want to manually encode an attachment: your program doesn't support attachments or the recipient's program can't decode any of the encoding methods that your program supports.

Because MIME headers are integrated with the message headers, it would be difficult, if not impossible, to manually insert a MIME encoded attachment in an existing message such that the recipient's program would automatically decode it. For this reason, Uuencode would be the best choice for manual encoding.
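
To see how the MIME headers end up woven into the message, here is a small sketch using Python's standard email package, which does the automatic encoding much as a mail program would. The addresses, subject and function name are made up for the example.

```python
from email.message import EmailMessage


def build_message_with_attachment(body_text, file_name, file_bytes):
    """Build a MIME message with one GIF attachment (illustrative helper)."""
    message = EmailMessage()
    message["From"] = "sender@example.com"      # placeholder address
    message["To"] = "recipient@example.com"     # placeholder address
    message["Subject"] = "The image file"
    message.set_content(body_text)
    # Adding the attachment turns the message into multipart/mixed and
    # Base64-encodes the binary data automatically.
    message.add_attachment(
        file_bytes, maintype="image", subtype="gif", filename=file_name
    )
    return message
```

Printing `build_message_with_attachment(...).as_string()` shows the MIME-Version and Content-Type lines sitting among the ordinary message headers, which is why an encoded MIME part can't simply be pasted into the body of an existing message.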

When you manually encode a file, the encoder program produces a plain text file. For example, the A.GIF file might get converted to A.UUE. ".UUE" is a common extension for Uuencoded files. However, the exact name is unimportant since it should appear nowhere in the sent message. Once you have the encoded file, you need to insert it as text into the body of the message. The encoded file should appear along with your message text, as in the following example.

Hi Bob,
Here's the image file that I promised you.

begin 644 a.gif
M1TE&.#EA)0`H`+,``.P`2QBQ`/__________________________________
M_____________________RP`````)0`H```$5S#(2:N]..O-N_]@*(YD:9YH
MB@(LH%ZM^U;Q3,6L+>6MW@>_5VXW%,J`Q500>5PUF:HE\4F20ITX'#:K-5FG
;WB3MZR&#J^(QM9RVF'7PN'Q.K]OO^+Q^!``[
`
end
Some people include a line, such as "------ CUT HERE -----" before and after the encoded text. This is unnecessary. If the recipient's program can decode the attachment type, it won't use those lines. If the recipient has to manually cut the encoded text for decoding, after they've done it once, it will be obvious to them what needs to be cut.
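
For the curious, the text block in the example above can be produced with a few lines of Python. This is a minimal sketch, not a replacement for a real encoder program: the function name is my own, and it assumes the common variant that writes zero values as a backtick, as the example does.

```python
import binascii


def uuencode_text(data, file_name, mode=0o644):
    """Return the Uuencoded form of data as text ready to paste into a message."""
    lines = ["begin {:o} {}".format(mode, file_name)]
    # Uuencode packs at most 45 bytes of the original file into each line.
    for offset in range(0, len(data), 45):
        chunk = data[offset:offset + 45]
        encoded = binascii.b2a_uu(chunk, backtick=True)
        lines.append(encoded.decode("ascii").rstrip("\n"))
    lines.append("`")   # a zero-length line marks the end of the data
    lines.append("end")
    return "\n".join(lines) + "\n"
```

The returned text is what you would Insert as Text into the message body, after your own message text.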

Notes: If you are manually encoding a file, be sure to Insert as Text and not Attach the encoded file. If you attach the encoded file, it gets encoded a second time. Not only does this make the resulting message larger than necessary, it also defeats the purpose of manual encoding. If the recipient could decode your attachments in the first place, there would be no need for the manual encoding. By attaching an already encoded file, you are forcing the recipient to double decode it.

If you manually Uuencode an attachment and insert that into a message that is using MIME, that may allow a recipient whose program only supports Uuencode to automatically decode it. However, if the recipient's program supports MIME, it may not automatically decode such a message, even if that program also supports Uuencode. That's because many programs don't expect MIME and Uuencode to be used in the same message. They only look for Uuencoded attachments in messages without any MIME headers.

If you are going to use compression, you compress the original file first. Then either attach the compressed file (if using automatic encoding) or encode the compressed file (if using manual encoding).



Decoding Mechanics

Most newer mail and news programs provide automatic decoding of attachments. However, your program might not support the encoding method used by the sender. If your program can handle the decoding of more than one method, it will usually automatically detect the message's method.

If your program automatically decoded an attachment, it will do one of several things, depending on the program, your options, the type of file and/or how it was attached.
You might need to manually decode an attachment. The reasons for this include: the sender used an encoding method not supported by your mail or news program; your mail or news server doesn't support MIME and removed some critical MIME headers; the sender double encoded the attachment (see Encoding Mechanics); and the original message was split into multiple parts (see Compression and Message Size Limits).

If your mail or news program decodes an attachment, but it needs further decoding, use the program's save attachment feature, then use the manual decoder on that file (for an exception, see Multipart Messages below). Otherwise you need to save the raw message as a plain text file. Most programs have a File, Save As option to save a message to an external file. Although they may give the file a special extension, these are normally plain text files. Your manual decoder program might expect that the file has a special extension, such as ".UUE", but this is usually not necessary.

It's a good idea to look at the saved file in your word processor to see what you've got. Comparing the file to the examples in Uuencode, MIME, Base64, Quoted-Printable, Binhex, yEnc and Other sections should allow you to identify the type of encoding used. If the file does not look like any of the encoding formats, see the Identifying Attachment File Types section for further help.
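
The same visual check can be roughed out in Python. This sketch looks only for the most obvious telltale lines (the Uuencode "begin" line, the Binhex banner, the yEnc "=ybegin" line and the MIME Content-Transfer-Encoding header). The function name and the returned labels are my own, and a file that matches none of them may still be encoded.

```python
import re


def guess_encoding(saved_text):
    """Guess the encoding of a saved message from its telltale lines."""
    for line in saved_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("=ybegin "):
            return "yEnc"
        if stripped.startswith("(This file must be converted with BinHex"):
            return "Binhex"
        # A Uuencode block starts with: begin <octal mode> <file name>
        if re.match(r"begin [0-7]{3,4} \S", stripped):
            return "Uuencode"
        if stripped.lower().startswith("content-transfer-encoding:"):
            value = stripped.split(":", 1)[1].strip().lower()
            if value == "base64":
                return "Base64"
            if value == "quoted-printable":
                return "Quoted-Printable"
    return "unknown"
```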

Depending on the manual decoder program that you use, you may need to do some editing of the saved file before decoding. You may need to remove message headers (e.g. From:, Subject:, etc.) and normal message text. However, MIME decoders generally expect the message headers, and Binhex decoders expect the "(This file must be converted with BinHex 4.0)" line. Some better decoder programs (e.g. Wincode) do a good job of ignoring what they don't need, so that you rarely need to edit the file before decoding.
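
As a rough illustration of what a forgiving Uuencode decoder does, the following Python sketch skips everything before the "begin" line (message headers and normal message text) and stops at "end". The function name is hypothetical, and real decoders such as Wincode cope with many more oddities than this does.

```python
import binascii


def decode_uuencoded_message(message_text):
    """Return (file_name, data) for the first Uuencoded block in the text."""
    file_name = None
    decoded = bytearray()
    for line in message_text.splitlines():
        if file_name is None:
            # Ignore headers and message text until the "begin" line appears.
            parts = line.split(" ", 2)
            if len(parts) == 3 and parts[0] == "begin" and parts[1].isdigit():
                file_name = parts[2].strip()
            continue
        if line.strip() == "end":
            return file_name, bytes(decoded)
        if line.strip():
            # Each data line decodes to at most 45 bytes of the original file.
            decoded.extend(binascii.a2b_uu(line))
    raise ValueError("no complete Uuencoded block found")
```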

You might also need to tell the decoder program the type of encoding used, or select a different decoder program based on the type of encoding. Wincode version 2.7.3 or later does a good job of determining the encoding type if set to "AUTO-1" decoding. (However, some files use Base64 with incomplete or no MIME headers. In that case you will need to manually strip the file of any headers before decoding and set Wincode to decode Raw Base64.)
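
Decoding raw Base64 is simple once the headers are gone. The sketch below uses Python's standard base64 module and assumes you have already stripped the file down to just the encoded lines, as described above. The function name is my own.

```python
import base64


def decode_raw_base64(stripped_text):
    """Decode Base64 text that has already had all headers removed."""
    # Line breaks and other whitespace are not part of the encoded data.
    compact = "".join(stripped_text.split())
    # validate=True raises an error if anything other than Base64
    # characters is left in the text, e.g. a forgotten header line.
    return base64.b64decode(compact, validate=True)
```

If this raises an error, the usual cause is a leftover header or a line of normal message text that still needs to be removed.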

Multipart Messages

Because of message size limits imposed by some ISPs (see Compression and Message Size Limits), larger attachments may have been split into multiple messages. Since decoding is the reverse of encoding, you must perform the steps in the reverse order. The original file was first encoded, then split into multiple messages. So you must first combine the multiple messages, then decode the combined result.

Some news programs (but fewer mail programs) may automatically identify the message parts, combine them and decode. Some others may allow you to identify the parts and order them, then the program will decode them. However, in many cases, you will need to manually save the parts and then decode them.

Your mail or news program may automatically decode the first message part. However, that doesn't do you any good. There is no practical way to combine that already decoded part with the rest of the parts. You will need to save each part (including the first) as a plain text file (as discussed above), then decode that. Your program may allow you to select all the parts and save as a single text file. If not, save each message part as a separate file. If you give the individual parts names such as FILE1.UUE, FILE2.UUE, FILE3.UUE, etc. then tell Wincode to decode FILE1.UUE, it will automatically find the other parts for decoding. For some decoders it may be necessary to use your word processor (or other program) to combine the individual parts into a single file before decoding.
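
The combine-then-decode order can be sketched in Python for Uuencoded parts. This is an illustration under several assumptions: the parts are passed in the correct order, each part is the raw saved text of one message, and encoded lines are recognized by a simple length check (the first character of a Uuencode line gives the number of bytes on that line). The function names are mine, and the check can be fooled by unusual text between the parts.

```python
import binascii


def is_uu_data_line(line):
    """Heuristic: does the line length match its leading length character?"""
    if not line:
        return False
    first = ord(line[0])
    if not 33 <= first <= 77:       # lengths 1 to 45 are written as '!' to 'M'
        return False
    byte_count = first - 32
    return len(line) == 1 + 4 * ((byte_count + 2) // 3)


def combine_and_decode(parts):
    """Combine the saved message parts first, then decode the result."""
    combined = "\n".join(parts)
    decoded = bytearray()
    started = False
    for line in combined.splitlines():
        if not started:
            started = line.startswith("begin ")
            continue
        if line.strip() == "end":
            return bytes(decoded)
        # Headers and blank lines between the parts fail the length check
        # and are skipped.
        if is_uu_data_line(line):
            decoded.extend(binascii.a2b_uu(line))
    raise ValueError("no complete Uuencoded file found in the parts")
```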

For Wincode and other decoding utilities, try these sites: ZDNET Hotfiles, SHAREWARE.COM, c|net Download.Com, Filez. Winzip also includes decoding functions. Aladdin Systems StuffIt Expander can also decode Uuencode and Binhex. DeBinHex decodes Binhex files. For yEnc attachments, see yEnc.



Problems and Complications

If there weren't problems and complications, you wouldn't be reading this. Many of these are caused by incompatibilities between the sending and receiving program. Another source of problems is the sender's lack of understanding of encoding. One or more of the following may apply to the problem message.


Notes on Mail and News Programs

Disclaimer: Information in this section is based on rumor and innuendo. I don't use most of these programs. Program features are subject to change without notice. Different versions of a program on the same or different platforms may have different features. For example, the Macintosh version might support both encoding and decoding using Binhex, while the Windows version might support only decoding Binhex, or not support Binhex at all. If you have more accurate or up-to-date information on these or other major mail and news programs regarding their support for Attachments, drop me a note. I might even update this page with that information.



Notes on Service Providers

Disclaimer: Information in this section is based on rumor and innuendo. I don't use most of these services. Service features are subject to change without notice. Users on the same service may have different software or versions. The service on different platforms may have different features. If you have more accurate or up-to-date information on these or other major services regarding their support for Attachments, drop me a note. I might even update this page with that information.

Some of the links below are restricted to members of the service.




Last updated: 2010-04-17