Outlook Message Parser

Outlook Message Parser is a small open source Java library that parses Outlook .msg files.

<dependency>
  <groupId>org.simplejavamail</groupId>
  <artifactId>outlook-message-parser</artifactId>
  <version>1.14.1</version>
</dependency>

Outlook Message Parser is a continuation of msgparser (or a fork, if that project continues independently).

Under the hood it uses the Apache POI POIFS library to parse the message files, which use the OLE 2 Compound Document format. It is thus merely a convenience library that takes care of the details of the .msg file format. The implementation is based on the information provided at fileformat.info.
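
To illustrate what "OLE 2 Compound Document" means at the byte level, here is a minimal sketch using only the JDK (Java 11+). It is not part of this library and does not use Apache POI; the class and method names (Ole2SignatureCheck, hasOle2Signature, looksLikeOle2Container) are made up for this example. It only checks the 8-byte signature that every OLE 2 container starts with, so a positive result means "this is an OLE 2 file" (which also covers legacy .doc and .xls files), not "this is a valid .msg file".

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration only; not an API of outlook-message-parser.
class Ole2SignatureCheck {

    // The 8-byte signature at the start of every OLE 2 Compound Document.
    private static final byte[] OLE2_SIGNATURE = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns true if the given bytes begin with the OLE 2 signature.
    static boolean hasOle2Signature(byte[] header) {
        return header.length >= OLE2_SIGNATURE.length
            && Arrays.equals(Arrays.copyOf(header, OLE2_SIGNATURE.length), OLE2_SIGNATURE);
    }

    // Reads only the first 8 bytes of the file and checks them.
    static boolean looksLikeOle2Container(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return hasOle2Signature(in.readNBytes(OLE2_SIGNATURE.length));
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java Ole2SignatureCheck.java some-mail.msg other-file.bin
        for (String arg : args) {
            System.out.println(arg + ": " + looksLikeOle2Container(Path.of(arg)));
        }
    }
}
```

A check like this can serve as a cheap pre-filter before handing a file to a real parser; the actual structure inside the container (directories, streams, MAPI properties) is what POIFS and this library deal with.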

v1.14.0 - v1.14.1

1.14.1 (08-June-2024): #64: [Bug] Parsing lists to HTML has double bullet points
1.14.0 (25-May-2024): #80: RTF converted to HTML doesn't always detect charset properly

v1.13.0 - v1.13.4

1.13.4 (04-May-2024): bumped Apache POI to 5.2.5 and managed commons-io to 2.16.1
1.13.3 (04-May-2024): bumped angus-activation from 2.0.2 to 2.0.3
1.13.2 (05-April-2024): #73 B: Don't overwrite existing address, but do retain X500 address if available
1.13.1 (04-April-2024): #73 A: Further improve X500 addresses detection
1.13.0 (18-January-2024): #71: Update to latest Jakarta+Angus dependencies

v1.12.0 (10-December-2023)

#70: [Enhancement] ignore recipients with null-address

v1.11.0 - v1.11.1

1.11.1 (08-December-2023): #69: Enhancement: instead of ignoring attachments with missing content completely, only ignore them for embedded images
1.11.0 (08-December-2023): #69: Enhancement: ignore attachment with missing content

v1.10.0 - v1.10.2

1.10.2 (03-December-2023): #68 Improved heuristics for X500 Names
1.10.1 (24-October-2023): #67 Fixed "possibility to parse X500 Names"
1.10.0 (24-October-2023): #67 Adding possibility to parse X500 Names (don't use this version)

v1.9.0 - v1.9.6

v1.9.6 (18-July-2022): #57 Same, but now with Collection values to support duplicate headers
v1.9.5 (18-July-2022): #57 Headers should be more accessible, rather than just a big string of text
v1.9.x - a bunch of dependency fixes and retries; my release train was not so smooth here, sorry
v1.9.0 (13-May-2021): #55 CVE issue: Update Apache POI and POI Scratchpad

v1.8.0 - v1.8.1

v1.8.1 (31-January-2022): #41 OutlookMessage.getPropertyValue() should be public
v1.8.0 (31-January-2022): #52 Adjust dependencies and make Java 9+ friendly
v1.8.0 (31-January-2022): #45 Bump commons-io from 2.6 to 2.7

v1.7.10 - v1.7.13 (17-November-2021)

#49 bugfix solved by improved charset handling
#46 bugfix Rare NPE case of producing empty nested outlook attachment when there should be no attachments
#43 bugfix getFromEmailFromHeaders cannot handle "quoted-name-with@at-sign"
some minor code improvements

v1.7.9 (10-October-2020)

#28 / #36 bugfix NumberFormatException on parsing .msg files

v1.7.8 (4-August-2020)

#35 Clarify permission to publish project using Apache v2 license

v1.7.0 - v1.7.7 (9-January-2020 - 17-July-2020)

v1.7.7 - #34 Wrong encoding for bodyHTML
v1.7.5 - #31 Bugfix for attachments with special characters in the name
v1.7.4 - #27 Same as 1.7.3, but now also for Chinese senders
v1.7.3 - #27 When from name/address are not available (unsent emails), these fields are filled with binary garbage
v1.7.2 - #26 To email address is not handled properly when name is omitted
v1.7.1 - #25 NPE on ClientSubmitTime when original message has not been sent yet
v1.7.1 - #23 Bug: _nameid directory should not be parsed (and causing invalid HTML body)
v1.7.0 - #18 Upgrade Apache POI 3.9 -> 4.x

Note: Apache POI requires Java 8 at minimum

v1.6.0 (8-January-2020)

#21 Multiple TO recipients are not handled properly

v1.5.0 (18-December-2019)

#20 CC and BCC recipients are not parsed properly
#19 Use real Outlook ContentId Attribute to resolve CID Attachments

v1.4.1 (22-October-2019)

#17 Fixed encoding error for UTF-8's Windows legacy name (cp)65001

v1.4.0 (13-October-2019)

#9 Replaced the RTF to HTML converter with a brand new RTF-compliant converter! (thanks to @fadeyev!)

v1.3.0 (4-October-2019)

#14 Dependency problem with Java9+, missing Jakarta Activation Framework
#13 HTML start tags with extra space not handled correctly
#11 SimpleRTF2HTMLConverter inserts too many tags
#10 Embedded images with DOS-like names are classified as attachments
#9 SimpleRTF2HTMLConverter removes some valid tags during conversion

v1.2.1 (12-May-2019)

Ignore non S/MIME related content types when extracting S/MIME metadata
Added toString and equals methods to the S/MIME data classes

v1.1.21 (4-May-2019)

Upgraded mediatype recognition based on file extension for incomplete attachments
Added / improved support for public S/MIME meta data

v1.1.20 (14-April-2019)

#7 Fix missing S/MIME header details that are needed to determine the type of S/MIME application

v1.1.19 (10-April-2019)

Log RTF compression errors, but otherwise ignore them and keep going, extracting what we can.

v1.1.18 (5-April-2019)

#6 Missing mimeTag for attachments should be guessed based on file extension

v1.1.17 (19-August-2018)

#3 Implemented robust support for character sets / code pages in RTF to HTML conversion (fixes Chinese support)
Fixed bug where too much text was cleaned up as part of the superfluous-RTF cleanup step when converting to HTML
Performance boost in the RTF -> HTML converter

v1.1.16 (~28-February-2017)

First Maven deployment, continuing version number from 1.1.15 of msgparser (https://github.com/bbottema/msgparser)

v1.16

Added support for replyTo name and address
cleaned up code (1st wave)
