Apache Hadoop 3.5.0 – Testing the Filesystem Contract
| Last Published: 2026-03-24
| Version: 3.5.0
Testing the Filesystem Contract
Running the tests
ftp://
Testing a new filesystem
Handling test failures
‘Lax vs Strict’ exceptions
Supporting FileSystems with login and authentication parameters
Important: passing the tests does not guarantee compatibility
Adding a new test suite
Root manipulation tests
Scalability tests
Extending the specification
Running the tests
A normal Hadoop test run will test those FileSystems that can be tested locally via the local filesystem. This typically means
file://
and its underlying
LocalFileSystem
, and
hdfs://
via the HDFS MiniCluster.
Other filesystems are skipped unless there is a specific configuration to the remote server providing the filesystem.
These filesystem bindings must be defined in an XML configuration file, usually
hadoop-common-project/hadoop-common/src/test/resources/contract-test-options.xml
. This file is excluded from version control and must not be checked in.
ftp://
In
contract-test-options.xml
, the filesystem name must be defined in the property
fs.contract.test.fs.ftp
. The specific login options to connect to the FTP Server must then be provided.
A path to a test directory must also be provided in the option
fs.contract.test.ftp.testdir
. This is the directory under which operations take place.
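Example:
<configuration>
  <property>
    <name>fs.contract.test.fs.ftp</name>
    <value>ftp://server1/</value>
  </property>

  <property>
    <name>fs.ftp.user.server1</name>
    <value>testuser</value>
  </property>

  <property>
    <name>fs.contract.test.ftp.testdir</name>
    <value>/home/testuser/test</value>
  </property>

  <property>
    <name>fs.ftp.password.server1</name>
    <value>secret-login</value>
  </property>
</configuration>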
Testing a new filesystem
The core of adding a new FileSystem to the contract tests is adding a new contract class, then creating a new non-abstract test class for every test suite that you wish to test.
Do not try to add these tests to Hadoop itself. They won’t be added to the source tree. The tests must live with your own filesystem source.
Create a package in your own test source tree (usually) under
contract
, for the files and tests.
Subclass
AbstractFSContract
for your own contract implementation.
For every test suite you plan to support create a non-abstract subclass, with the name starting with
Test
and the name of the filesystem. Example:
TestHDFSRenameContract
These non-abstract classes must implement the abstract method
createContract()
Identify and document any filesystem bindings that must be defined in a
src/test/resources/contract-test-options.xml
file of the specific project.
Run the tests until they work.
As an example, here is the implementation of the
create()
tests for the local filesystem.
package org.apache.hadoop.fs.contract.localfs;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.contract.AbstractCreateContractTest;
import org.apache.hadoop.fs.contract.AbstractFSContract;

public class TestLocalCreateContract extends AbstractCreateContractTest {
  @Override
  protected AbstractFSContract createContract(Configuration conf) {
    // Bind the create() test suite to the local filesystem contract.
    return new LocalFSContract(conf);
  }
}
The standard implementation technique for subclasses of
AbstractFSContract
is to be driven entirely by a Hadoop XML configuration file stored in the test resource tree. The best practice is to store it under
/contract
with the name of the FileSystem, such as
contract/localfs.xml
. Having the XML file define all FileSystem options makes the listing of FileSystem behaviors immediately visible.
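As an illustration only, a contract file for a hypothetical filesystem might look like the fragment below. The file name `contract/myfs.xml` and the chosen values are invented for this sketch. Only `fs.contract.supports-strict-exceptions` is named in this document; the other two key names are assumed to follow the same `fs.contract.` prefix, so check them against the `ContractOptions` class before relying on them.

```xml
<configuration>
  <!-- Hypothetical contract/myfs.xml; values describe an imaginary store. -->
  <property>
    <name>fs.contract.is-case-sensitive</name>
    <value>true</value>
  </property>

  <property>
    <name>fs.contract.supports-unix-permissions</name>
    <value>false</value>
  </property>

  <property>
    <name>fs.contract.supports-strict-exceptions</name>
    <value>true</value>
  </property>
</configuration>
```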
The
LocalFSContract
is a special case of this, as it must adjust its case sensitivity policy based on the OS on which it is running: for both Windows and OS/X, the filesystem is case insensitive, so the
ContractOptions.IS_CASE_SENSITIVE
option must be set to false. Furthermore, the Windows filesystem does not support Unix file and directory permissions, so the relevant flag must also be set. This is done
after
loading the XML contract file from the resource tree, simply by updating the now-loaded configuration options:
getConf().setBoolean(getConfKey(ContractOptions.SUPPORTS_UNIX_PERMISSIONS), false);
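The "load first, then adjust" order described above can be sketched without any Hadoop dependency. In this stand-in, a plain map replaces the Hadoop `Configuration`, and the class name, method name and key strings are assumptions made for the example; it is not how `LocalFSContract` itself is written.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Stand-in for a contract class; a plain map replaces the Hadoop Configuration.
class OsAwareContractOptions {
    // Key names are assumptions for this sketch.
    static final String IS_CASE_SENSITIVE = "fs.contract.is-case-sensitive";
    static final String SUPPORTS_UNIX_PERMISSIONS = "fs.contract.supports-unix-permissions";

    /** Returns a copy of the loaded options, adjusted for the given os.name value. */
    static Map<String, Boolean> adjustForOs(Map<String, Boolean> loaded, String osName) {
        Map<String, Boolean> adjusted = new HashMap<>(loaded);
        String os = osName.toLowerCase(Locale.ROOT);
        boolean windows = os.startsWith("windows");
        boolean mac = os.startsWith("mac");
        if (windows || mac) {
            // Both platforms are treated as case insensitive.
            adjusted.put(IS_CASE_SENSITIVE, false);
        }
        if (windows) {
            // No Unix file and directory permissions on Windows.
            adjusted.put(SUPPORTS_UNIX_PERMISSIONS, false);
        }
        return adjusted;
    }

    public static void main(String[] args) {
        // Values as they might arrive from the XML contract file.
        Map<String, Boolean> fromXml = new HashMap<>();
        fromXml.put(IS_CASE_SENSITIVE, true);
        fromXml.put(SUPPORTS_UNIX_PERMISSIONS, true);
        System.out.println(adjustForOs(fromXml, System.getProperty("os.name")));
    }
}
```

Keeping the XML file as the baseline and applying only the platform-specific overrides in code leaves most of the declared behavior visible in one place.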
Handling test failures
If your new
FileSystem
test cases fail one of the contract tests, what can you do?
It depends on the cause of the problem:
Case: custom
FileSystem
subclass doesn’t correctly implement the specification. Fix it.
Case: Underlying filesystem doesn’t behave in a way that matches Hadoop’s expectations. Ideally, fix. Or try to make your
FileSystem
subclass hide the differences, e.g. by translating exceptions.
Case: fundamental architectural differences between your filesystem and Hadoop. Example: different concurrency and consistency model. Recommendation: document and make clear that the filesystem is not compatible with HDFS.
Case: test does not match the specification. Fix: patch test, submit the patch to Hadoop.
Case: specification incorrect. The underlying specification is (with a few exceptions) HDFS. If the specification does not match HDFS, HDFS should normally be assumed to be the real definition of what a FileSystem should do. If there’s a mismatch, please raise it on the
hdfs-dev
mailing list. Note that while FileSystem tests live in the core Hadoop codebase, it is the HDFS team who owns the FileSystem specification and the tests that accompany it.
If a test needs to be skipped because a feature is not supported, look for an existing configuration option in the
ContractOptions
class. If there is no such option, the short-term fix is to override the test method and use the
ContractTestUtils.skip()
method to log the fact that the test is skipped. This method prints the message to the logs, then tells the test runner that the test was skipped. This highlights the problem.
A recommended strategy is to call the superclass, catch the exception, and verify that the exception class and part of the error string match those raised by the current implementation. It should also
fail()
if the superclass call actually succeeded, that is, if the operation did not fail in the way the implementation currently does. This ensures that the test path is still executed and that any other failure of the test, possibly a regression, is picked up. And if the feature does become implemented, that change is picked up too.
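A minimal, dependency-free sketch of that strategy follows. `BaseAppendSuite` and `NoAppendSuite` are hypothetical stand-ins for an abstract contract suite and its filesystem-specific subclass, and an unchecked `UnsupportedOperationException` stands in for whatever exception the real filesystem raises; none of these names come from Hadoop.

```java
// Stand-in for an abstract contract test suite; not a Hadoop class.
abstract class BaseAppendSuite {
    /** Operation under test, supplied by the filesystem-specific subclass. */
    protected abstract void append(String path);

    /** The inherited test: passes only if append() completes normally. */
    public void testAppend() {
        append("/test/file");
    }
}

// Hypothetical suite for a store whose append() is not implemented yet.
class NoAppendSuite extends BaseAppendSuite {
    private final boolean appendImplemented;

    NoAppendSuite(boolean appendImplemented) {
        this.appendImplemented = appendImplemented;
    }

    @Override
    protected void append(String path) {
        if (!appendImplemented) {
            throw new UnsupportedOperationException("append is not supported: " + path);
        }
    }

    @Override
    public void testAppend() {
        try {
            super.testAppend(); // still execute the real test path
        } catch (UnsupportedOperationException e) {
            // Check the exception class and part of the message against
            // the failure the current implementation is known to raise.
            if (!e.getMessage().contains("not supported")) {
                throw e;
            }
            return; // the known, documented failure
        }
        // The superclass test succeeded, so the feature now works.
        throw new AssertionError("append() unexpectedly succeeded; remove this override");
    }

    /** Runs the overridden test and reports the outcome as a string. */
    static String run(boolean appendImplemented) {
        try {
            new NoAppendSuite(appendImplemented).testAppend();
            return "passed";
        } catch (AssertionError e) {
            return "failed: " + e.getMessage();
        }
    }
}
```

Here `NoAppendSuite.run(false)` passes because the expected failure occurred, while `NoAppendSuite.run(true)` fails, which is the signal that the override can be removed.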
A long-term solution is to enhance the base test to add a new optional feature key. This will require collaboration with the developers on the
hdfs-dev
mailing list.
‘Lax vs Strict’ exceptions
The contract tests include the notion of strict vs lax exceptions.
Strict
exception reporting means that failures are reported using specific subclasses of
IOException
, such as
FileNotFoundException
,
EOFException
, and so on.
Lax
reporting means throwing a plain
IOException
.
While FileSystems SHOULD raise stricter exceptions, there may be reasons why they cannot. Raising lax exceptions is still allowed; it merely hampers diagnostics of failures in user applications. To declare that a FileSystem does not support the stricter exceptions, set the option
fs.contract.supports-strict-exceptions
to false.
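The difference between the two modes can be sketched as a single check. The class and method below are hypothetical and are not the helpers the contract tests actually use; they only show how a test might accept any `IOException` in lax mode while insisting on the specific subclass in strict mode.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical helper illustrating strict versus lax exception checks.
class ExceptionPolicy {
    /**
     * Returns true if the thrown exception is acceptable for an operation
     * that is expected to fail with expectedType.
     */
    static boolean isAcceptable(boolean strict,
                                Class<? extends IOException> expectedType,
                                Exception thrown) {
        if (strict) {
            // Strict: the specific subclass (or a subclass of it) is required.
            return expectedType.isInstance(thrown);
        }
        // Lax: any IOException will do.
        return thrown instanceof IOException;
    }

    public static void main(String[] args) {
        Exception generic = new IOException("no such file");
        System.out.println(isAcceptable(true, FileNotFoundException.class, generic));
        System.out.println(isAcceptable(false, FileNotFoundException.class, generic));
    }
}
```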
Supporting FileSystems with login and authentication parameters
Tests against remote FileSystems will require the URL to the FileSystem to be specified; tests against remote FileSystems that require login details require usernames/IDs and passwords.
All these details MUST be placed in the file
src/test/resources/contract-test-options.xml
, and your SCM tools MUST be configured to never commit this file to Subversion, Git or equivalent. Furthermore, the build MUST be configured to never bundle this file in any
-test
artifacts generated. The Hadoop build does this, excluding
src/test/**/*.xml
from the JAR files. In addition,
src/test/resources/auth-keys.xml
will need to be created. It can be a copy of
contract-test-options.xml
. The
AbstractFSContract
class automatically loads this resource file if present; specific keys for specific test cases can be added.
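As an example, here is what S3A test keys look like:
<configuration>
  <property>
    <name>fs.contract.test.fs.s3a</name>
    <value>s3a://tests3contract</value>
  </property>

  <property>
    <name>fs.s3a.access.key</name>
    <value>DONOTCOMMITTHISKEYTOSCM</value>
  </property>

  <property>
    <name>fs.s3a.secret.key</name>
    <value>DONOTEVERSHARETHISSECRETKEY!</value>
  </property>
</configuration>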
The
AbstractBondedFSContract
automatically skips a test suite if the FileSystem URL is not defined in the property
fs.contract.test.fs.%s
, where
%s
matches the scheme name of the FileSystem.
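The key lookup that drives this skipping can be sketched as follows. The class and method names are hypothetical, and a plain map stands in for the loaded test configuration; the only detail taken from the text is the `fs.contract.test.fs.%s` pattern.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the "is this filesystem configured?" check.
class BondedKeyLookup {
    static final String FS_KEY_PATTERN = "fs.contract.test.fs.%s";

    /** Builds the property name for a filesystem scheme such as "s3a". */
    static String keyFor(String scheme) {
        return String.format(FS_KEY_PATTERN, scheme);
    }

    /** Returns the test filesystem URI, or empty if the suite should be skipped. */
    static Optional<String> testFilesystemUri(Map<String, String> options, String scheme) {
        String uri = options.get(keyFor(scheme));
        if (uri == null || uri.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(uri.trim());
    }
}
```

A suite would call `testFilesystemUri(options, "s3a")` during setup and skip itself when the result is empty, so an unconfigured remote store never causes a failure.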
When running the tests
maven.test.skip
will need to be turned off since it is true by default on these tests. This can be done with a command like
mvn test -Ptests-on
Important: passing the tests does not guarantee compatibility
Passing all the FileSystem contract tests does not mean that a filesystem can be described as “compatible with HDFS”. The tests try to look at the isolated functionality of each operation, and focus on the preconditions and postconditions of each action. Core areas not covered are concurrency and aspects of failure across a distributed system.
Consistency: are all changes immediately visible?
Atomicity: are operations which HDFS guarantees to be atomic equally so on the new filesystem?
Idempotency: if the filesystem implements any retry policy, is it idempotent even while other clients manipulate the filesystem?
Scalability: does it support files as large as HDFS, or as many in a single directory?
Durability: do files actually last, and for how long?
There are also some specific aspects of the use of the FileSystem API:
Compatibility with the
hadoop fs
CLI.
Whether the blocksize policy produces file splits that are suitable for analytics workloads (as an example, a blocksize of 1 matches the specification but, because it tells MapReduce jobs to work a byte at a time, is unusable).
Tests that verify these behaviors are of course welcome.
Adding a new test suite
New tests should be split up with a test class per operation, as is done for
seek()
,
rename()
,
create()
, and so on. This is to match up the way that the FileSystem contract specification is split up by operation. It also makes it easier for FileSystem implementors to work on one test suite at a time.
Subclass
AbstractFSContractTestBase
with a new abstract test suite class. Again, use
Abstract
in the title.
Look at
org.apache.hadoop.fs.contract.ContractTestUtils
for utility methods to aid testing, with lots of filesystem-centric assertions. Use these to make assertions about the filesystem state, and to include diagnostic information such as directory listings and dumps of mismatched files when an assertion actually fails.
Write tests for the local, raw local and HDFS filesystems. If one of these fails the tests, that is a sign of a problem, though be aware that they do have differences.
Test on the object stores once the core filesystems are passing the tests.
Try to log failures with as much detail as you can; the people debugging the failures will appreciate it.
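To show what "assertions with diagnostics" means in practice, here is a small sketch against the local filesystem using only `java.nio.file`. It is an analogue written for this example, not the `ContractTestUtils` code, and the class and method names are assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Local-filesystem analogue of a filesystem-centric assertion.
class DiagnosticAssertions {
    /** Produces a one-line listing of a directory for use in failure messages. */
    static String describeDirectory(Path dir) {
        if (!Files.isDirectory(dir)) {
            return dir + " is not a directory";
        }
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                .map(p -> p.getFileName().toString())
                .sorted()
                .collect(Collectors.joining(", ", dir + " contains [", "]"));
        } catch (IOException e) {
            return "could not list " + dir + ": " + e;
        }
    }

    /** Fails with the parent directory listing if the path does not exist. */
    static void assertPathExists(String message, Path path) {
        if (Files.exists(path)) {
            return;
        }
        Path parent = path.toAbsolutePath().getParent();
        String listing = parent == null ? "path has no parent" : describeDirectory(parent);
        throw new AssertionError(message + ": " + path + " not found; " + listing);
    }
}
```

When the assertion fails, the message already says what was actually in the directory, which is often enough to tell a wrong name from a missing write.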
Root manipulation tests
Some tests work directly against the root filesystem, attempting to do things like rename “/” and similar actions. The root directory is “special”, and it’s important to test this, especially on non-POSIX filesystems such as object stores. These tests are potentially very destructive to native filesystems, so use care.
Add the tests under
AbstractRootDirectoryContractTest
or create a new test with (a)
Root
in the title and (b) a check in the setup method to skip the test if root tests are disabled:
skipIfUnsupported(TEST_ROOT_TESTS_ENABLED);
Don’t provide an implementation of this test suite to run against the local FS.
Scalability tests
Tests designed to generate scalable load, whether a large number of small files or fewer, larger files, should be configurable, so that users of the test suite can set the number and size of files.
Be aware that on object stores, the directory rename operation is usually
O(files)*O(data)
while the delete operation is
O(files)
. The latter means that even directory cleanup operations may take time and can potentially time out. It is important to design tests that work against remote filesystems with possible delays in all operations.
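One way to make such a test configurable is to read its scale from properties with conservative defaults. The sketch below is only an illustration: the property names, default values and class name are all invented, and a real suite would more likely read them from the Hadoop test configuration.

```java
import java.util.Properties;

// Hypothetical holder for scale-test settings.
class ScaleTestSettings {
    // Invented property names; not options defined by Hadoop.
    static final String FILE_COUNT = "scale.test.file.count";
    static final String FILE_SIZE_KB = "scale.test.file.size.kb";
    static final String TIMEOUT_SECONDS = "scale.test.timeout.seconds";

    final int fileCount;
    final int fileSizeKb;
    final int timeoutSeconds;

    ScaleTestSettings(Properties props) {
        // Small defaults keep a default run cheap; users raise them as needed.
        fileCount = intOption(props, FILE_COUNT, 100);
        fileSizeKb = intOption(props, FILE_SIZE_KB, 64);
        // A generous timeout allows for slow cleanup on remote stores.
        timeoutSeconds = intOption(props, TIMEOUT_SECONDS, 600);
    }

    private static int intOption(Properties props, String key, int defaultValue) {
        String value = props.getProperty(key);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    /** Total volume of data the test will write, in kilobytes. */
    long totalKb() {
        return (long) fileCount * fileSizeKb;
    }
}
```

Constructing it with `new ScaleTestSettings(System.getProperties())` would let a user override any value with a `-D` option on the command line.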
Extending the specification
The specification is incomplete. It doesn’t have complete coverage of the FileSystem classes, and there may be bits of the existing specified classes that are not covered.
Look at the implementations of a class/interface/method to see what they do, especially HDFS and local. These are the documentation of what is done today.
Look at the POSIX API specification.
Search through the HDFS JIRAs for discussions on FileSystem topics, and try to understand what was meant to happen, as well as what does happen.
Use an IDE to find out how methods are used in Hadoop, HBase and other parts of the stack. Although this assumes that these are representative Hadoop applications, it will at least show how applications
expect
a FileSystem to behave.
Look in the java.io source to see how the bundled FileSystem classes are expected to behave, and read their javadocs carefully.
If something is unclear, ask on the hdfs-dev list.
Don’t be afraid to write tests to act as experiments and clarify what actually happens. Use the HDFS behaviours as the normative guide.