Apache Tika – Security Model
Security Model
Parsing is dangerous. Bad things can happen when parsing untrusted data. See, for example, our Security page, which documents fixed vulnerabilities in Tika and its dependencies. These vulnerabilities include, among others, denial of service, XML external entity injection/server-side request forgery, command injection, and deserialization of untrusted objects.
Apache Tika is primarily designed to work with trusted/sanitized data. Users are responsible for handling crashes and other consequences from parsing untrusted data. See the Robustness of Apache Tika for guidance on how to run Tika more safely.
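One common way to contain crashes and hangs is to run each parse in a separate, time-limited child process, so that a failure takes down the child rather than the calling application. The following is a minimal sketch of that idea using only the Java standard library; the class name IsolatedRun, its Outcome values, and the timeout handling are illustrative assumptions, not part of Tika's API.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not a Tika class): runs a command in a child
// process and kills it if it exceeds the time limit.
final class IsolatedRun {
    enum Outcome { OK, FAILED, TIMED_OUT }

    static Outcome run(List<String> command, long timeoutSeconds) {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout and discard both so the child cannot
        // block on a full pipe. A real setup would redirect to a file.
        builder.redirectErrorStream(true);
        builder.redirectOutput(ProcessBuilder.Redirect.DISCARD);

        Process child;
        try {
            child = builder.start();
        } catch (IOException e) {
            return Outcome.FAILED; // the command could not be started
        }

        try {
            if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                child.destroyForcibly(); // hung or too slow: kill it
                return Outcome.TIMED_OUT;
            }
            // A non-zero exit code covers crashes and out-of-memory exits.
            return child.exitValue() == 0 ? Outcome.OK : Outcome.FAILED;
        } catch (InterruptedException e) {
            child.destroyForcibly();
            Thread.currentThread().interrupt(); // preserve interrupt status
            return Outcome.FAILED;
        }
    }
}
```

A caller might invoke it as `IsolatedRun.run(List.of("java", "-jar", "/path/to/tika-app.jar", "--text", "untrusted.bin"), 60)`, where the jar path, the input file name, and the 60-second limit are placeholders. Process isolation limits the impact of crashes and hangs; it does not by itself restrict what the child process can read or write.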
The project does not view denial of service issues as security issues. Nevertheless, we do appreciate reports and pull requests to harden the codebase against denial of service and all vulnerabilities.
Mime detection and content extraction are both inherently challenging and error-prone tasks. In high-risk applications such as cross-domain filtering or search, we advise against trusting either mime detection or content extraction without verification.
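As one illustration of what "verification" might look like in practice, an application can treat the detected media type as a hint and accept a file only when it agrees with an independent signal and falls on an explicit allowlist. The sketch below assumes the caller already has a declared type (for example, from an upload form) and a detected type as strings; the class MediaTypeGate, the allowlist contents, and the policy itself are hypothetical and not part of Tika.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical policy check (not a Tika class): accept a file only if the
// declared and detected media types agree and the type is allowlisted.
final class MediaTypeGate {
    // Example allowlist; a real application would choose its own.
    private static final Set<String> ALLOWED =
            Set.of("application/pdf", "text/plain");

    // Drops parameters such as "; charset=UTF-8" and lower-cases the type.
    static String normalize(String mediaType) {
        if (mediaType == null) {
            return "";
        }
        int semicolon = mediaType.indexOf(';');
        String base = semicolon >= 0 ? mediaType.substring(0, semicolon) : mediaType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    static boolean accept(String declaredType, String detectedType) {
        String declared = normalize(declaredType);
        String detected = normalize(detectedType);
        return !detected.isEmpty()
                && detected.equals(declared)
                && ALLOWED.contains(detected);
    }
}
```

Agreement between the two types is a necessary check under this policy, not proof that a file is safe: a crafted file can be valid as more than one format at once.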
Tika is not designed to identify, or to render safe, files that are crafted to trigger direct vulnerabilities or to create parser differentials (such as with polyglots, chimeras, schizophrenic files or ...).
Files can be crafted to evade detection, hinder analysis or otherwise cause mayhem in countless ways.
Running tika-server adds its own security risks. Depending on the settings and the loaded modules (tika-pipes, for example), it is possible that a client could have read and write access at the same level as the user running the application. We strongly encourage defense in depth with tika-server, including, for example, isolating access to its endpoints, setting up two-way TLS, and limiting tika-server's user permissions among other standard security practices.
Copyright © 2026 The Apache Software Foundation
Apache Tika, Tika, Apache, the Apache feather logo, and the Apache
Tika project logo are trademarks of The Apache Software Foundation.