Apache Tika – Security
Security
The following is an incomplete list of known and fixed Common Vulnerabilities and Exposures (CVEs) and other vulnerabilities in Apache Tika or its dependencies. Please help us fill this in with more details.
See also our security model.
CVE or Vulnerability
Description
Reporter
Affected Versions
CVE-2025-54988
XXE when parsing XFA via the PDFParser
Paras Jain and Yakov Shafranovich of Amazon
1.13-3.2.1
CVE-2023-42503
commons-compress uncontrolled resource consumption vulnerability while parsing tar files
???
???-2.9.0
CVE-2022-33879
Regex DoS in StandardsExtractingContentHandler; incomplete fix for CVE-2022-30973/CVE-2022-30126, plus a separate, newly found regex DoS
Tony Torralba, Jaroslav Lobačevski and Tim Allison
???-2.4.0 and ???-1.28.3
CVE-2022-30973
Regex DoS in StandardsExtractingContentHandler; missed fix in 1.28.2
Cathy Hu, SUSE Software Solutions Germany GmbH
???-1.28.2
CVE-2022-25169
BPGParser Memory Usage DoS
???
???-2.3.0 and ???-1.28.1
CVE-2022-30126
Regex DoS in StandardsExtractingContentHandler
CodeQL team members Tony Torralba and Joseph Farebrother
???-2.3.0 and ???-1.28.1
CVE-2021-44832
Remote Code Execution via JDBC Appender in log4j2
???
2.0.0-BETA-2.2.1
CVE-2021-44228
Critical Remote Code Execution in log4j2
???
2.0.0-BETA-2.1.0
CVE-2021-31812
Infinite loop when loading a crafted PDF in PDFBox before 2.0.24
Chaoyuan Peng
?-1.26
CVE-2021-31811
OutOfMemoryException when loading a crafted PDF in PDFBox before 2.0.24
Chaoyuan Peng
?-1.26
CVE-2021-28657
Infinite loop in the MP3Parser.
Khaled Nassar
?-1.25
CVE-2021-27906
Out of memory error while loading a file in PDFBox before 2.0.23.
Fabian Meumertzheim
?-1.25
CVE-2021-27807
Infinite loop while loading a file in PDFBox before 2.0.23.
Fabian Meumertzheim
?-1.25
CVE-2020-9489
System.exit vulnerability in Tika's OneNote Parser; out of memory errors and/or infinite loops in Tika's ICNSParser, MP3Parser, MP4Parser, SAS7BDATParser, OneNoteParser and ImageParser.
Tim Allison
1.0-1.24
CVE-2020-1950
Excessive memory usage (DoS) vulnerability in Apache Tika's PSDParser
Pierre Ernst
1.0-1.23
CVE-2020-1951
Infinite Loop (DoS) vulnerability in Apache Tika's PSDParser
Tim Allison
1.0-1.23
CVE-2019-10094
StackOverflow from Crafted Package/Compressed Files in Apache Tika's RecursiveParserWrapper
Tim Allison; files contributed by Matthew Barber and Erling Ellingsen
1.7-1.21
CVE-2019-10093
Denial of Service in Apache Tika's 2003ml and 2006ml Parsers
Tim Allison
1.19-1.21
CVE-2019-10088
OOM from a crafted Zip File in Apache Tika's RecursiveParserWrapper
RunningSnail
1.7-1.21
PDFBOX-4550
OOM from corrupt ToUnicode stream in PDFs
Tilman Hausherr
?-1.21
CVE-2019-0228
XML External Entity (XXE) in xfdf loading in PDFBox (regular Tika parsing would likely not be vulnerable)
Kurt Boberg
?-1.20
CVE-2018-20346
(Provided) SQLite before 3.25.3 allows remote attackers to execute arbitrary code
Pat Cashman (notified Tika team)
?-1.20
CVE-2018-17197
Infinite Loop in Tika's SQLite3Parser
Tim Allison
1.8-1.19.1
CVE-2018-11796
XML Entity Expansion in Tika's SAXParsers after reset()
Slava Gorelik
?-1.19
CVE-2018-11797
Very long loop parsing page tree in PDFBox
Shawn Rasheed and Jens Dietrich
?-1.19
CVE-2018-11771
Infinite Loop in Commons-Compress ZipArchiveInputStream
Tobias Ospelt
?-1.18
CVE-2018-8017
Infinite Loop in IptcAnpaParser
Rohan Padhye and Tobias Ospelt
1.2-1.18
CVE-2018-8036
Infinite Loop leading to OOM in PDFBox's AFMParser
Tobias Ospelt
?-1.18
CVE-2018-12418
Infinite Loop in junrar
Tobias Ospelt
?-1.18
CVE-2018-11761
XML Entity Expansion Vulnerability
Renfei (Brian) Wang
0.1-1.18
CVE-2018-11762
Rare Zip Slip Vulnerability in tika-app
Tim Allison
0.9-1.18
RIFFReader
Infinite Loop in AudioParser in Java 8 and 9
Sergey Bylokhov and Tobias Ospelt
?-1.18
TIKA-2446
OOM detecting OPCPackage files with corrupt ZIP
Thorsten Schäfer
?-1.18
PDFBOX-4014
Infinite loop in JBig2 (versions less than 3.0.0)
Hanno Böck
(if user supplied) ?-1.17
CVE-2018-1339
Infinite loop in ChmParser
Tobias Ospelt
?-1.17
CVE-2018-1338
Infinite loop in BPGParser
Tobias Ospelt
?-1.17
CVE-2018-1335
Command Execution in tika-server
Tim Allison
?-1.17
CVE-2017-12626
Apache POI - Infinite loops in WMF, EMF, MSG and macros; OOMs in DOC, PPT and XLS
Tim Allison, Luís Filipe Nassif and Jerome Lacoste
?-1.17
CVE-2018-1324 and COMPRESS-432
Commons Compress - Infinite loop in ZipFile
Luís Filipe Nassif and Anton Abashkin
?-1.17
CVE-2018-7489 and TIKA-2634
Jackson - Deserialization vulnerability
Richard Cyganiak (notified Tika team)
?-1.17
PDFBOX-3919
Apache PDFBox - Infinite loop
Hanno Böck and Andreas Bogk
?-1.16
TIKA-2115
Apache POI - OOM parsing OLE object
Thomas Galla
?-1.15
COMPRESS-382
Commons Compress - OOM detecting corrupt LZMA
Luís Filipe Nassif
?-1.15
COMPRESS-386 and TIKA-1631
Commons Compress - OOM detecting corrupt x-compress
Pavel Micka
?-1.15
TIKA-2045 and TIKA-3442
Apache PDFBox - OOM in font caching
Egbert
?-1.13
TIKA-1866 and TIKA-954
Apache POI - OOM in DOCX and PPTX because of bug in Piccolo parser
Rob Tulloh and Shawn Johnson
?-1.13
TIKA-2040
GC-Overload and OOM in CHMParser
Luís Filipe Nassif
?-1.13
CVE-2016-6809
jmatio - Deserialization Vulnerability in MATLAB parser
Pierre Ernst
1.6-1.13
CVE-2016-4434
XXE Vulnerability in several parsers
Arthur Khashaev, Seulgi Kim, Mesut Timur (and Tim Allison while remediating initial issue reported by Arthur et al.)
0.10-1.12
CVE-2016-2175
XML External Entity (XXE) in PDFBox
???
?-1.12
CVE-2015-3271
Remote Access to host files via tika-server
Tim Allison
1.9?-1.10
PDFBOX-2811
Apache PDFBox - Infinite Loop
Andreas Lehmkühler
?-1.10
PDFBOX-2200
Apache PDFBox - Slowly building memory leak because of static caching of fonts
Matthew Buckett
?-1.6
TIKA-1471
Apache PDFBox - OOM with corrupt PDF
Alan Burlison
?-1.6
TIKA-788
Infinite Loop in DWG
Stas Shaposhnikov
?-1.4?
TIKA-1132
Apache POI - Nearly Infinite Loop in XLS
Ryan Krueger
?-1.4
TIKA-1179
Infinite Loop in corrupt MP3
Marius Dumitru Florea
?-1.4
TIKA-866
OOM reading Tika config file
Stephan Mühlstrasser
?-1.1
Third-party vulnerabilities that may or may not be triggerable via regular use of Apache Tika.
CVE or Vulnerability
Description
Reporter
Affected Versions
CVE-2018-10237
Unbounded memory allocation in Google Guava
Pat Cashman (notified Tika team)
?-1.20
CVE-2018-19362
FasterXML jackson-databind may allow attackers to have unspecified impact from polymorphic deserialization
Pat Cashman (notified Tika team)
?-1.20
Acronyms and Terms
Command Execution -- A malicious client could execute arbitrary commands on the command line of the machine running tika-server.
Deserialization Vulnerability -- See OWASP's Cheat Sheet. A malicious actor could run arbitrary code on your computer.
OOM -- Out of Memory Error -- Parsers may allocate more memory than is available. This can sometimes be caused by parsers not performing sanity checks before allocation. See, for example, TIKA-1631.
XXE -- XML External Entity Processing. A malicious client could access data on your system.
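The XXE entry above describes a class of bug that several rows in the tables share. As a rough illustration only, and not a description of how Tika itself configures its parsers, the following sketch hardens a plain JAXP DocumentBuilderFactory so that documents carrying a DOCTYPE are refused. The class and method names (XxeHardeningSketch, hardenedFactory, isRejected) are hypothetical, and the feature URIs are assumed to be supported by the JAXP implementation in use (they are recognized by the JDK's built-in parser; other implementations may throw ParserConfigurationException).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical illustration; not Tika's actual XML configuration.
final class XxeHardeningSketch {

    /** Builds a DOM factory that refuses DOCTYPE declarations and external entities. */
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Rejecting DOCTYPE outright blocks both external entities and entity expansion.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Belt and braces in case DOCTYPE has to be re-enabled later.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    /** Returns true if the hardened parser refuses the document, false if it parses. */
    static boolean isRejected(String xml) {
        try {
            DocumentBuilder builder = hardenedFactory().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true; // the parser reported a fatal error, e.g. DOCTYPE is disallowed
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("Parser could not be configured or read input", e);
        }
    }

    public static void main(String[] args) {
        String malicious =
            "<!DOCTYPE r [<!ENTITY x SYSTEM \"file:///etc/passwd\">]><r>&x;</r>";
        System.out.println("malicious rejected: " + isRejected(malicious));
        System.out.println("benign rejected: " + isRejected("<r>ok</r>"));
    }
}
```

Refusing the DOCTYPE entirely is the bluntest of the available settings; applications that need DTDs would have to rely on the narrower entity features instead.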
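The OOM entry above mentions parsers that skip sanity checks before allocating. A minimal sketch of such a check, assuming a made-up record format with a 4-byte big-endian length prefix, might look like the following. The class BoundedAllocationSketch, the method readRecord, and the limit MAX_RECORD_BYTES are hypothetical names chosen for illustration and are not part of Tika.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration of validating a declared length before allocating.
final class BoundedAllocationSketch {

    // Assumed upper bound for a single record; a real parser would pick a format-specific limit.
    static final int MAX_RECORD_BYTES = 1 << 20;

    /**
     * Reads one length-prefixed record, refusing lengths that are negative
     * or larger than maxBytes before any buffer is allocated.
     */
    static byte[] readRecord(InputStream in, int maxBytes) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int declared = data.readInt(); // attacker-controlled value from the file
        if (declared < 0 || declared > maxBytes) {
            throw new IOException(
                "Declared record length " + declared + " is outside [0, " + maxBytes + "]");
        }
        byte[] buffer = new byte[declared]; // safe: bounded by maxBytes
        data.readFully(buffer);             // throws EOFException if the file is truncated
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        byte[] wellFormed = {0, 0, 0, 3, 'a', 'b', 'c'};
        byte[] payload = readRecord(new ByteArrayInputStream(wellFormed), MAX_RECORD_BYTES);
        System.out.println("read " + payload.length + " bytes");

        // Header claims roughly 2 GiB although no payload follows.
        byte[] corrupt = {0x7f, (byte) 0xff, (byte) 0xff, (byte) 0xff};
        try {
            readRecord(new ByteArrayInputStream(corrupt), MAX_RECORD_BYTES);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without the bounds check, the corrupt header alone would trigger an allocation of about 2 GiB before the parser noticed that the data was missing.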
Copyright © 2026
The Apache Software Foundation
Apache Tika, Tika, Apache, the Apache feather logo, and the Apache
Tika project logo are trademarks of The Apache Software Foundation.