1.
Introduction
This section is informative.
Web applications should have the ability to manipulate as wide as possible a range of user input,
including files that a user may wish to upload to a remote server or manipulate inside a rich web application.
This specification defines the basic representations for files,
lists of files,
errors raised by access to files,
and programmatic ways to read files.
This specification also defines an interface that represents "raw data"
which can be asynchronously processed on the main thread of conforming user agents.
The interfaces and API defined in this specification can be used with other interfaces and APIs exposed to the web platform.
The File interface represents file data typically obtained from the underlying file system,
and the Blob interface ("Binary Large Object", a name originally introduced to web APIs in Google Gears)
represents immutable raw data.
File or Blob reads should happen asynchronously on the main thread,
with an optional synchronous API used within threaded web applications.
An asynchronous API for reading files prevents blocking and UI "freezing" on a user agent’s main thread.
This specification defines an asynchronous API based on an event model
to read and access a File or Blob’s data.
A FileReader object provides asynchronous read methods to access that file’s data
through event handler content attributes and the firing of events.
Using events and event handlers lets separate code blocks monitor the progress of the read
(which is particularly useful for remote drives or mounted drives,
where file access performance may vary from local drives)
and error conditions that may arise during reading of a file.
In the example below, different code blocks handle progress, error, and success conditions.
    function startRead() {
      // obtain input element through DOM
      var file = document.getElementById('file').files[0];
      if (file) {
        getAsText(file);
      }
    }

    function getAsText(readFile) {
      var reader = new FileReader();

      // Read file into memory as UTF-16
      reader.readAsText(readFile, "UTF-16");

      // Handle progress, success, and errors
      reader.onprogress = updateProgress;
      reader.onload = loaded;
      reader.onerror = errorHandler;
    }

    function updateProgress(evt) {
      if (evt.lengthComputable) {
        // evt.loaded and evt.total are ProgressEvent properties
        var loaded = (evt.loaded / evt.total);
        if (loaded < 1) {
          // Increase the prog bar length
          // style.width = (loaded * 200) + "px";
        }
      }
    }

    function loaded(evt) {
      // Obtain the read file data
      var fileString = evt.target.result;
      // Handle UTF-16 file dump
      if (utils.regexp.isChinese(fileString)) {
        //Chinese Characters + Name validation
      }
      else {
        // run other charset test
      }
      // xhr.send(fileString)
    }

    function errorHandler(evt) {
      if (evt.target.error.name == "NotReadableError") {
        // The file could not be read
      }
    }
2.
Terminology and Algorithms
When this specification says to
terminate an algorithm
the user agent must terminate the algorithm after finishing the step it is on.
Asynchronous
read methods
defined in this specification may return before the algorithm in question is terminated,
and can be terminated by an
abort()
call.
The algorithms and steps in this specification use the following mathematical operations:
- max(a,b) returns the maximum of a and b, and is always performed on integers as they are defined in WebIDL [WebIDL]; in the case of max(6,4) the result is 6. This operation is also defined in ECMAScript [ECMA-262].
- min(a,b) returns the minimum of a and b, and is always performed on integers as they are defined in WebIDL [WebIDL]; in the case of min(6,4) the result is 4. This operation is also defined in ECMAScript [ECMA-262].
- Mathematical comparisons such as < (less than), ≤ (less than or equal to), and > (greater than) are as in ECMAScript [ECMA-262].
The term Unix Epoch is used in this specification to refer to the time 00:00:00 UTC on January 1 1970
(or 1970-01-01T00:00:00Z ISO 8601);
this is the same time that is conceptually "0" in ECMA-262 [ECMA-262].
The slice blob algorithm given a Blob blob, start, end, and contentType is used to refer to the following steps
and returns a new Blob containing the bytes ranging from the start parameter
up to but not including the end parameter. It must act as follows:
1. Let originalSize be blob’s size.
2. The start parameter, if non-null, is a value for the start point of a slice blob call, and must be treated as a byte-order position, with the zeroth position representing the first byte. User agents must normalize start according to the following:
   a. If start is null, let relativeStart be 0.
   b. If start is negative, let relativeStart be max((originalSize + start), 0).
   c. Otherwise, let relativeStart be min(start, originalSize).
3. The end parameter, if non-null, is a value for the end point of a slice blob call. User agents must normalize end according to the following:
   a. If end is null, let relativeEnd be originalSize.
   b. If end is negative, let relativeEnd be max((originalSize + end), 0).
   c. Otherwise, let relativeEnd be min(end, originalSize).
4. The contentType parameter, if non-null, is used to set the ASCII-encoded string in lower case representing the media type of the Blob. User agents must normalize contentType according to the following:
   a. If contentType is null, let relativeContentType be set to the empty string.
   b. Otherwise, let relativeContentType be set to contentType and run the substeps below:
      1. If relativeContentType contains any characters outside the range of U+0020 to U+007E, then set relativeContentType to the empty string and return from these substeps.
      2. Convert every character in relativeContentType to ASCII lowercase.
5. Let span be max((relativeEnd - relativeStart), 0).
6. Return a new Blob object with the following characteristics:
   a. It refers to span consecutive bytes from blob’s associated byte sequence, beginning with the byte at byte-order position relativeStart.
   b. Its size is span.
   c. Its type is relativeContentType.
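The normalization steps above can be sketched as a plain function. This is only an illustration under stated assumptions: normalizeSliceArguments is a hypothetical helper name that is not defined by this specification, and null stands in for a parameter that was not given.

```javascript
// Sketch of the "slice blob" argument normalization described above.
// normalizeSliceArguments is a hypothetical helper, not part of the File API.
function normalizeSliceArguments(originalSize, start = null, end = null, contentType = null) {
  // Negative positions count back from the end; results are clamped to [0, originalSize].
  const normalize = (value, fallback) => {
    if (value === null) return fallback;
    if (value < 0) return Math.max(originalSize + value, 0);
    return Math.min(value, originalSize);
  };

  const relativeStart = normalize(start, 0);
  const relativeEnd = normalize(end, originalSize);

  let relativeContentType = "";
  if (contentType !== null) {
    // Reject anything outside U+0020..U+007E, otherwise ASCII-lowercase it.
    relativeContentType = /[^\u0020-\u007E]/.test(contentType)
      ? ""
      : contentType.replace(/[A-Z]/g, (c) => c.toLowerCase());
  }

  const span = Math.max(relativeEnd - relativeStart, 0);
  return { relativeStart, relativeEnd, span, relativeContentType };
}

// The last 30 bytes of a 100-byte blob: relativeStart 70, relativeEnd 100, span 30.
console.log(normalizeSliceArguments(100, -30, null, "Text/Plain"));
```

An end position that precedes the start position yields a span of 0 rather than an error, which is why slicing with swapped arguments produces an empty Blob.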
3.
The Blob Interface and Binary Data
A Blob object refers to a byte sequence,
and has a size attribute which is the total number of bytes in the byte sequence,
and a type attribute,
which is an ASCII-encoded string in lower case representing the media type of the byte sequence.
Each Blob must have an internal snapshot state
which must be initially set to the state of the underlying storage,
if any such underlying storage exists.
Further normative definition of snapshot state can be found for Files.
    [Exposed=(Window,Worker), Serializable]
    interface Blob {
      constructor(optional sequence<BlobPart> blobParts,
                  optional BlobPropertyBag options = {});

      readonly attribute unsigned long long size;
      readonly attribute DOMString type;

      // slice Blob into byte-ranged chunks
      Blob slice(optional [Clamp] long long start,
                 optional [Clamp] long long end,
                 optional DOMString contentType);

      // read from the Blob.
      [NewObject] ReadableStream stream();
      [NewObject] Promise<USVString> text();
      [NewObject] Promise<ArrayBuffer> arrayBuffer();
      [NewObject] Promise<Uint8Array> bytes();
    };

    enum EndingType { "transparent", "native" };

    dictionary BlobPropertyBag {
      DOMString type = "";
      EndingType endings = "transparent";
    };

    typedef (BufferSource or Blob or USVString) BlobPart;
Blob objects are serializable objects. Their serialization steps, given value and serialized, are:
1. Set serialized.[[SnapshotState]] to value’s snapshot state.
2. Set serialized.[[ByteSequence]] to value’s underlying byte sequence.
Their deserialization steps, given serialized and value, are:
1. Set value’s snapshot state to serialized.[[SnapshotState]].
2. Set value’s underlying byte sequence to serialized.[[ByteSequence]].
3.1.
Constructors
The Blob() constructor can be invoked with zero or more parameters.
When the Blob() constructor is invoked, user agents must run the following steps:
1. If invoked with zero parameters, return a new Blob object consisting of 0 bytes, with size set to 0, and with type set to the empty string.
2. Let bytes be the result of processing blob parts given blobParts and options.
3. If the type member of the options argument is not the empty string, run the following sub-steps:
   1. Let t be the type dictionary member. If t contains any characters outside the range U+0020 to U+007E, then set t to the empty string and return from these substeps.
   2. Convert every character in t to ASCII lowercase.
4. Return a Blob object referring to bytes as its associated byte sequence, with its size set to the length of bytes, and its type set to the value of t from the substeps above.
3.1.1.
Constructor Parameters
The Blob() constructor can be invoked with the parameters below:
- A blobParts sequence, which takes any number of the following types of elements, and in any order: BufferSource elements, Blob elements, and USVString elements.
- An optional BlobPropertyBag, which takes these optional members: type and endings.
To process blob parts given a sequence of BlobPart’s parts and BlobPropertyBag options, run the following steps:
1. Let bytes be an empty sequence of bytes.
2. For each element in parts:
   1. If element is a USVString, run the following substeps:
      1. Let s be element.
      2. If the endings member of options is "native", set s to the result of converting line endings to native of element.
      3. Append the result of UTF-8 encoding s to bytes.
      Note: The algorithm from WebIDL [WebIDL] replaces unmatched surrogates in an invalid UTF-16 string with U+FFFD replacement characters. Scenarios exist when the Blob constructor may result in some data loss due to lost or scrambled character sequences.
   2. If element is a BufferSource, get a copy of the bytes held by the buffer source, and append those bytes to bytes.
   3. If element is a Blob, append the bytes it represents to bytes.
      Note: The type of the Blob array element is ignored and will not affect type of returned Blob object.
3. Return bytes.
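The following sketch illustrates how string and BufferSource parts could be turned into one byte sequence. It is not the user agent algorithm: processBlobPartsSketch and its nativeLineEnding parameter are hypothetical names introduced here, Blob parts are left out because script cannot read their bytes synchronously, and the line-ending conversion is written as a regular-expression shortcut.

```javascript
// Sketch of "process blob parts" for string and BufferSource parts only.
// processBlobPartsSketch and nativeLineEnding are hypothetical, for illustration.
function processBlobPartsSketch(parts, options = {}, nativeLineEnding = "\n") {
  const encoder = new TextEncoder(); // TextEncoder always produces UTF-8
  const chunks = [];

  for (const element of parts) {
    if (typeof element === "string") {
      let s = element;
      if (options.endings === "native") {
        // Shortcut: CRLF, lone CR and lone LF each become one native line ending.
        s = s.replace(/\r\n|\r|\n/g, nativeLineEnding);
      }
      // Unmatched surrogates are encoded as U+FFFD (bytes EF BF BD).
      chunks.push(encoder.encode(s));
    } else if (ArrayBuffer.isView(element)) {
      // Copy only the bytes that the view covers.
      chunks.push(new Uint8Array(element.buffer, element.byteOffset, element.byteLength).slice());
    } else if (element instanceof ArrayBuffer) {
      chunks.push(new Uint8Array(element.slice(0)));
    } else {
      throw new TypeError("Unsupported part in this sketch");
    }
  }

  // Concatenate all chunks into a single byte sequence.
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes;
}

const sample = processBlobPartsSketch(["a\nb", new Uint8Array([1, 2])], { endings: "native" }, "\r\n");
console.log(sample.length); // 6: "a", CR, LF, "b", then the two raw bytes
```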
To convert line endings to native in a string s, run the following steps:
1. Let native line ending be the code point U+000A LF.
2. If the underlying platform’s conventions are to represent newlines as a carriage return and line feed sequence, set native line ending to the code point U+000D CR followed by the code point U+000A LF.
3. Set result to the empty string.
4. Let position be a position variable for s, initially pointing at the start of s.
5. Let token be the result of collecting a sequence of code points that are not equal to U+000A LF or U+000D CR from s given position.
6. Append token to result.
7. While position is not past the end of s:
   1. If the code point at position within s equals U+000D CR:
      1. Append native line ending to result.
      2. Advance position by 1.
      3. If position is not past the end of s and the code point at position within s equals U+000A LF, advance position by 1.
   2. Otherwise if the code point at position within s equals U+000A LF, advance position by 1 and append native line ending to result.
   3. Let token be the result of collecting a sequence of code points that are not equal to U+000A LF or U+000D CR from s given position.
   4. Append token to result.
8. Return result.
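The steps above can be followed almost line by line in script. The sketch below makes two assumptions: script cannot query the platform's newline convention, so the caller passes it in through a hypothetical nativeLineEnding parameter, and the string is walked by UTF-16 code unit, which gives the same result because CR and LF are single code units.

```javascript
// Step-by-step sketch of "convert line endings to native".
// nativeLineEnding is an assumed parameter: pass "\n" or "\r\n" explicitly.
function convertLineEndingsToNative(s, nativeLineEnding = "\n") {
  const isLineBreak = (ch) => ch === "\n" || ch === "\r";
  let result = "";
  let position = 0;

  // Collect a run of characters that are neither LF nor CR, starting at position.
  const collectToken = () => {
    const begin = position;
    while (position < s.length && !isLineBreak(s[position])) position++;
    return s.slice(begin, position);
  };

  result += collectToken();
  while (position < s.length) {
    if (s[position] === "\r") {
      result += nativeLineEnding;
      position++;
      // A CR LF pair counts as a single line ending.
      if (position < s.length && s[position] === "\n") position++;
    } else if (s[position] === "\n") {
      position++;
      result += nativeLineEnding;
    }
    result += collectToken();
  }
  return result;
}

console.log(JSON.stringify(convertLineEndingsToNative("a\r\nb\rc\nd", "\r\n")));
// "a\r\nb\r\nc\r\nd"
```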
Examples of constructor usage follow.
    // Create a new Blob object
    var a = new Blob();

    // Create a 1024-byte ArrayBuffer
    // buffer could also come from reading a File
    var buffer = new ArrayBuffer(1024);

    // Create ArrayBufferView objects based on buffer
    var shorts = new Uint16Array(buffer, 512, 128);
    var bytes = new Uint8Array(buffer, shorts.byteOffset + shorts.byteLength);

    var b = new Blob(["foobarbazetcetc" + "birdiebirdieboo"], {type: "text/plain;charset=utf-8"});

    var c = new Blob([b, shorts]);

    var a = new Blob([b, c, bytes]);

    var d = new Blob([buffer, b, c, bytes]);
3.2.
Attributes
size
of type
unsigned long long
, readonly
Returns the size of the
byte
sequence in number of bytes.
On getting, conforming user agents must return the total number of bytes that can be read by a
FileReader
or
FileReaderSync
object,
or 0 if the
Blob
has no bytes to be read.
type
of type
DOMString
, readonly
The ASCII-encoded string in lower case representing the media type of the
Blob
On getting, user agents must return the type of a
Blob
as an ASCII-encoded string in lower case,
such that when it is converted to a
byte
sequence,
it is a
parsable MIME type
or the empty string – 0 bytes – if the type cannot be determined.
The type attribute can be set by the web application itself through constructor invocation
and through the slice() call;
in these cases, further normative conditions for this attribute are in
§ 3.1 Constructors, § 4.1 Constructor, and § 3.3.1 The slice() method respectively.
User agents can also determine the type of a Blob,
especially if the byte sequence is from an on-disk file;
in this case, further normative conditions are in the file type guidelines.
Note:
The type
of a
Blob
is considered a
parsable MIME type
if performing the
parse a MIME type
algorithm to a byte sequence converted from
the ASCII-encoded string representing the Blob object’s type does not return failure.
Note:
Use of the
type
attribute informs the
package data
algorithm
and determines the
Content-Type
header when
fetching
blob URLs
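A short illustration of how size and type behave from script, assuming an environment that exposes Blob as a global (browsers, or a recent Node.js release):

```javascript
// Blob is assumed to be available as a global.
const note = new Blob(["hello, ", "world"], { type: "Text/Plain;Charset=UTF-8" });
console.log(note.size); // 12, because "hello, world" is 12 bytes in UTF-8
console.log(note.type); // "text/plain;charset=utf-8" (ASCII-lowercased)

// A type with characters outside U+0020 to U+007E becomes the empty string.
const odd = new Blob(["x"], { type: "text/plain\u00e9" });
console.log(odd.type === ""); // true

// slice() can give the new Blob a different type.
const part = note.slice(0, 5, "application/octet-stream");
console.log(part.size, part.type); // 5 "application/octet-stream"
```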
3.3.
Methods and Parameters
3.3.1.
The
slice()
method
The
slice()
method
returns a new
Blob
object with bytes ranging from the optional
start
parameter
up to but not including the optional
end
parameter, and with a
type
attribute
that is the value of the optional
contentType
parameter. It must act as follows:
Let
sliceStart
sliceEnd
, and
sliceContentType
be null.
If
start
is given, set
sliceStart
to
start
If
end
is given, set
sliceEnd
to
end
If
contentType
is given, set
sliceContentType
to
contentType
Return the result of
slice blob
given
this
sliceStart
sliceEnd
, and
sliceContentType
The examples below illustrate the different types of
slice()
calls possible. Since the
File
interface inherits from the
Blob
interface, examples are based on the use of the
File
interface.
    // obtain input element through DOM
    var file = document.getElementById('file').files[0];
    if (file)
    {
      // create an identical copy of file
      // the two calls below are equivalent
      var fileClone = file.slice();
      var fileClone2 = file.slice(0, file.size);

      // slice file into 1/2 chunk starting at middle of file
      // Note the use of negative number
      var fileChunkFromEnd = file.slice(-(Math.round(file.size / 2)));

      // slice file into 1/2 chunk starting at beginning of file
      var fileChunkFromStart = file.slice(0, Math.round(file.size / 2));

      // slice file from beginning till 150 bytes before end
      var fileNoMetadata = file.slice(0, -150, "application/experimental");
    }
3.3.2.
The
stream()
method
The
stream()
method, when invoked, must return
the result of calling
get stream
on
this
3.3.3.
The
text()
method
The
text()
method, when invoked, must run these steps:
Let
stream
be the result of calling
get stream
on
this
Let
reader
be the result of
getting a reader
from
stream
If that threw an exception, return a new promise rejected with that exception.
Let
promise
be the result of
reading all bytes
from
stream
with
reader
Return the result of transforming
promise
by a fulfillment handler that returns the result of
running
UTF-8 decode
on its first argument.
Note:
This is different from the behavior of
readAsText()
to align better
with the behavior of
Fetch’s text()
. Specifically this method will always
use UTF-8 as encoding, while
FileReader
can use a different encoding depending on
the blob’s type and passed in encoding name.
3.3.4.
The
arrayBuffer()
method
The
arrayBuffer()
method, when invoked, must run these steps:
Let
stream
be the result of calling
get stream
on
this
Let
reader
be the result of
getting a reader
from
stream
If that threw an exception, return a new promise rejected with that exception.
Let
promise
be the result of
reading all bytes
from
stream
with
reader
Return the result of transforming
promise
by a fulfillment handler that returns
a new
ArrayBuffer
whose contents are its first argument.
3.3.5.
The
bytes()
method
The
bytes()
method, when invoked, must run these steps:
Let
stream
be the result of calling
get stream
on
this
Let
reader
be the result of
getting a reader
from
stream
If that threw an exception, return a new promise rejected with that exception.
Let
promise
be the result of
reading all bytes
from
stream
with
reader
Return the result of transforming
promise
by a fulfillment handler that returns
a new
Uint8Array
wrapping an
ArrayBuffer
containing its first argument.
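The three promise-returning methods share one pattern: get a stream, read every chunk, then convert the collected bytes. A rough script-level equivalent is sketched below. readAllBytes is a hypothetical helper, and Blob, ReadableStream and TextDecoder are assumed to be available as globals.

```javascript
// Sketch of the "read all bytes" pattern shared by text(), arrayBuffer() and bytes().
// readAllBytes is a hypothetical helper, not part of the File API.
async function readAllBytes(blob) {
  const reader = blob.stream().getReader();
  const chunks = [];
  let total = 0;

  // Pull chunks until the stream reports that it is done.
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    total += value.byteLength;
  }

  // Join the chunks into a single Uint8Array.
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes;
}

// text() corresponds to UTF-8 decoding the collected bytes.
readAllBytes(new Blob(["caf\u00e9"])).then((bytes) => {
  console.log(bytes.length, new TextDecoder().decode(bytes)); // logs: 5 café
});
```

In application code, calling blob.text(), blob.arrayBuffer() or blob.bytes() directly is simpler; the helper only makes the intermediate steps visible.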
4.
The File Interface
A File object is a Blob object with a name attribute, which is a string;
it can be created within the web application via a constructor,
or is a reference to a byte sequence from a file from the underlying (OS) file system.
If a
File
object is a reference to a
byte
sequence originating from a file on disk,
then its
snapshot state
should be set to the state of the file on disk at the time the
File
object is created.
Note:
This is a non-trivial requirement to implement for user agents,
and is thus not a
must
but a
should
[RFC2119]
User agents should endeavor to have a
File
object’s
snapshot state
set to the state of the underlying storage on disk at the time the reference is taken.
If the file is modified on disk following the time a reference has been taken,
the
File
’s
snapshot state
will differ from the state of the underlying storage.
User agents may use modification time stamps and other mechanisms to maintain
snapshot state
but this is left as an implementation detail.
When a
File
object refers to a file on disk,
user agents must return the
type
of that file,
and must follow the
file type guidelines
below:
User agents must return the
type
as an ASCII-encoded string in lower case,
such that when it is converted to a corresponding byte sequence,
it is a
parsable MIME type
or the empty string – 0 bytes – if the type cannot be determined.
When the file is of type
text/plain
user agents must NOT append a charset parameter to the
dictionary of parameters
portion of the media type
[MIMESNIFF]
User agents must not attempt heuristic determination of encoding,
including statistical methods.
    [Exposed=(Window,Worker), Serializable]
    interface File : Blob {
      constructor(sequence<BlobPart> fileBits,
                  USVString fileName,
                  optional FilePropertyBag options = {});
      readonly attribute DOMString name;
      readonly attribute long long lastModified;
    };

    dictionary FilePropertyBag : BlobPropertyBag {
      long long lastModified;
    };
File objects are serializable objects. Their serialization steps, given value and serialized, are:
1. Set serialized.[[SnapshotState]] to value’s snapshot state.
2. Set serialized.[[ByteSequence]] to value’s underlying byte sequence.
3. Set serialized.[[Name]] to the value of value’s name attribute.
4. Set serialized.[[LastModified]] to the value of value’s lastModified attribute.
Their deserialization steps, given value and serialized, are:
1. Set value’s snapshot state to serialized.[[SnapshotState]].
2. Set value’s underlying byte sequence to serialized.[[ByteSequence]].
3. Initialize the value of value’s name attribute to serialized.[[Name]].
4. Initialize the value of value’s lastModified attribute to serialized.[[LastModified]].
4.1.
Constructor
The File constructor is invoked with two or three parameters,
depending on whether the optional dictionary parameter is used.
When the File() constructor is invoked, user agents must run the following steps:
1. Let bytes be the result of processing blob parts given fileBits and options.
2. Let n be the fileName argument to the constructor.
   Note: Underlying OS filesystems use differing conventions for file name; with constructed files, mandating UTF-16 lessens ambiguity when file names are converted to byte sequences.
3. Process FilePropertyBag dictionary argument by running the following substeps:
   1. If the type member is provided and is not the empty string, let t be set to the type dictionary member. If t contains any characters outside the range U+0020 to U+007E, then set t to the empty string and return from these substeps.
   2. Convert every character in t to ASCII lowercase.
   3. If the lastModified member is provided, let d be set to the lastModified dictionary member. If it is not provided, set d to the current date and time represented as the number of milliseconds since the Unix Epoch (which is the equivalent of Date.now() [ECMA-262]).
      Note: Since ECMA-262 Date objects convert to long long values representing the number of milliseconds since the Unix Epoch, the lastModified member could be a Date object [ECMA-262].
4. Return a new File object F such that:
   1. F refers to the bytes byte sequence.
   2. F.size is set to the number of total bytes in bytes.
   3. F.name is set to n.
   4. F.type is set to t.
   5. F.lastModified is set to d.
4.1.1.
Constructor Parameters
The File() constructor can be invoked with the parameters below:
- A fileBits sequence, which takes any number of the following elements, and in any order: BufferSource elements, Blob elements, and USVString elements.
- A fileName parameter: a USVString parameter representing the name of the file; normative conditions for this constructor parameter can be found in § 4.1 Constructor.
- An optional FilePropertyBag dictionary, which in addition to the members of BlobPropertyBag takes one member: lastModified.
4.2.
Attributes
name
of type
DOMString
, readonly
The name of the file.
On getting, this must return the name of the file as a string.
There are numerous file name variations and conventions used by different underlying OS file systems;
this is merely the name of the file, without path information.
On getting, if user agents cannot make this information available,
they must return the empty string.
If a
File
object is created using a constructor,
further normative conditions for this attribute are found in
§ 4.1 Constructor
lastModified
of type
long long
, readonly
The last modified date of the file.
On getting, if user agents can make this information available,
this must return a
long long
set to the time the file was last modified
as the number of milliseconds since the
Unix Epoch
If the last modification date and time are not known,
the attribute must return the current date and time
as a long long representing the number of milliseconds since the Unix Epoch;
this is equivalent to Date.now() [ECMA-262].
If a
File
object is created using a constructor,
further normative conditions for this attribute are found in
§ 4.1 Constructor
The File interface is available on objects that expose an attribute of type FileList;
these objects are defined in HTML [HTML].
The File interface, which inherits from Blob, is immutable,
and thus represents file data that can be read into memory at the time a read operation is initiated.
User agents must process reads on files that no longer exist at the time of read as errors,
throwing a NotFoundError exception if using a FileReaderSync on a Web Worker [Workers]
or firing an error event with the error attribute returning a NotFoundError.
In the examples below, metadata from a file object is displayed meaningfully, and a file object is created with a name and a last modified date.
    var file = document.getElementById("filePicker").files[0];
    var date = new Date(file.lastModified);
    println("You selected the file " + file.name + " which was modified on " + date.toDateString() + ".");

    ...

    // Generate a file with a specific last modified date
    var d = new Date(2013, 12, 5, 16, 23, 45, 600);
    var generatedFile = new File(["Rough Draft ...."], "Draft1.txt", {type: "text/plain", lastModified: d})

    ...
5.
The FileList Interface
Note:
The
FileList
interface should be considered "at risk"
since the general trend on the Web Platform is to replace such interfaces
with the
Array
platform object in ECMAScript
[ECMA-262]
In particular, this means syntax of the sort filelist.item(0) is at risk;
most other programmatic use of
FileList
is unlikely to be affected by the eventual migration to an
Array
type.
This interface is a list of
File
objects.
    [Exposed=(Window,Worker), Serializable]
    interface FileList {
      getter File? item(unsigned long index);
      readonly attribute unsigned long length;
    };
FileList objects are serializable objects. Their serialization steps, given value and serialized, are:
1. Set serialized.[[Files]] to an empty list.
2. For each file in value, append the sub-serialization of file to serialized.[[Files]].
Their deserialization steps, given serialized and value, are:
1. For each file of serialized.[[Files]], add the sub-deserialization of file to value.
Sample usage typically involves DOM access to the <input type="file"> element within a form,
and then accessing selected files.
    // uploadData is a form element
    // fileChooser is input element of type 'file'
    var file = document.forms['uploadData']['fileChooser'].files[0];

    // alternative syntax can be
    // var file = document.forms['uploadData']['fileChooser'].files.item(0);

    if (file)
    {
      // Perform file ops
    }
5.1.
Attributes
length
of type
unsigned long
, readonly
must return the number of files in the
FileList
object.
If there are no files, this attribute must return 0.
5.2.
Methods and Parameters
item(index)
must return the
index
th
File
object in the
FileList
If there is no
index
th
File
object in the
FileList
then this method must return
null
index must be treated by user agents as a value for the position of a File object in the FileList,
with 0 representing the first file.
Supported property indices
are the numbers in the range zero
to one less than the number of
File
objects represented by the
FileList
object.
If there are no such
File
objects,
then there are no supported property indices.
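The item(index) contract can be illustrated with a plain array standing in for a FileList. This is a sketch only: itemAt is a hypothetical helper, and real FileList objects are created by the user agent rather than by script.

```javascript
// Sketch of the item(index) contract using an array as a stand-in for a FileList.
// itemAt is a hypothetical helper, not part of the File API.
function itemAt(files, index) {
  // Supported property indices run from 0 to files.length - 1.
  return index >= 0 && index < files.length ? files[index] : null;
}

const selected = [{ name: "a.txt" }, { name: "b.txt" }];
console.log(itemAt(selected, 0).name); // "a.txt"
console.log(itemAt(selected, 2));      // null
console.log(itemAt([], 0));            // null
```

On a real FileList, bracket access past the last supported index evaluates to undefined, whereas item() returns null as described above.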
Note:
The
HTMLInputElement
interface has a readonly attribute of type
FileList
which is what is being accessed in the above example.
Other interfaces with a readonly attribute of type
FileList
include the
DataTransfer
interface.
6.
Reading Data
6.1.
The File Reading Task Source
This specification defines a new generic
task source
called the
file reading task source
which is used for all
tasks that are queued
in this specification
to read byte sequences associated with
Blob
and
File
objects.
It is to be used for features that trigger in response to asynchronously reading binary data.
6.2.
The FileReader API
    [Exposed=(Window,Worker)]
    interface FileReader: EventTarget {
      constructor();
      // async read methods
      undefined readAsArrayBuffer(Blob blob);
      undefined readAsBinaryString(Blob blob);
      undefined readAsText(Blob blob, optional DOMString encoding);
      undefined readAsDataURL(Blob blob);

      undefined abort();

      // states
      const unsigned short EMPTY = 0;
      const unsigned short LOADING = 1;
      const unsigned short DONE = 2;

      readonly attribute unsigned short readyState;

      // File or Blob data
      readonly attribute (DOMString or ArrayBuffer)? result;

      readonly attribute DOMException? error;

      // event handler content attributes
      attribute EventHandler onloadstart;
      attribute EventHandler onprogress;
      attribute EventHandler onload;
      attribute EventHandler onabort;
      attribute EventHandler onerror;
      attribute EventHandler onloadend;
    };
A FileReader has an associated state, that is "empty", "loading", or "done". It is initially "empty".
A FileReader has an associated result (null, a DOMString or an ArrayBuffer). It is initially null.
A FileReader has an associated error (null or a DOMException). It is initially null.
The
FileReader()
constructor,
when invoked, must return a new
FileReader
object.
The readyState attribute’s getter, when invoked, switches on this’s state and runs the associated step:
"empty": Return EMPTY
"loading": Return LOADING
"done": Return DONE
The
result
attribute’s getter,
when invoked, must return
this
’s
result
The
error
attribute’s getter,
when invoked, must return
this
’s
error
A FileReader fr has an associated read operation algorithm,
which given blob, a type and an optional encodingName, runs the following steps:
1. If fr’s state is "loading", throw an InvalidStateError DOMException.
2. Set fr’s state to "loading".
3. Set fr’s result to null.
4. Set fr’s error to null.
5. Let stream be the result of calling get stream on blob.
6. Let reader be the result of getting a reader from stream.
7. Let bytes be an empty byte sequence.
8. Let chunkPromise be the result of reading a chunk from stream with reader.
9. Let isFirstChunk be true.
10. In parallel, while true:
    1. Wait for chunkPromise to be fulfilled or rejected.
    2. If chunkPromise is fulfilled, and isFirstChunk is true, queue a task to fire a progress event called loadstart at fr.
       We might change loadstart to be dispatched synchronously, to align with XMLHttpRequest behavior. [Issue #119]
    3. Set isFirstChunk to false.
    4. If chunkPromise is fulfilled with an object whose done property is false and whose value property is a Uint8Array object, run these steps:
       1. Let bs be the byte sequence represented by the Uint8Array object.
       2. Append bs to bytes.
       3. If roughly 50ms have passed since these steps were last invoked, queue a task to fire a progress event called progress at fr.
       4. Set chunkPromise to the result of reading a chunk from stream with reader.
    5. Otherwise, if chunkPromise is fulfilled with an object whose done property is true, queue a task to run the following steps and abort this algorithm:
       1. Set fr’s state to "done".
       2. Let result be the result of package data given bytes, type, blob’s type, and encodingName.
       3. If package data threw an exception error:
          1. Set fr’s error to error.
          2. Fire a progress event called error at fr.
       4. Else:
          1. Set fr’s result to result.
          2. Fire a progress event called load at fr.
       5. If fr’s state is not "loading", fire a progress event called loadend at fr.
          Note: Event handler for the load or error events could have started another load, if that happens the loadend event for this load is not fired.
    6. Otherwise, if chunkPromise is rejected with an error error, queue a task to run the following steps and abort this algorithm:
       1. Set fr’s state to "done".
       2. Set fr’s error to error.
       3. Fire a progress event called error at fr.
       4. If fr’s state is not "loading", fire a progress event called loadend at fr.
          Note: Event handler for the error event could have started another load, if that happens the loadend event for this load is not fired.
Use the file reading task source for all these tasks.
6.2.1.
Event Handler Content Attributes
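The following are the event handler content attributes
(and their corresponding event handler event types)
that user agents must support on FileReader as DOM attributes:
    event handler content attribute | event handler event type
    onloadstart                     | loadstart
    onprogress                      | progress
    onload                          | load
    onabort                         | abort
    onerror                         | error
    onloadend                       | loadend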
6.2.2.
FileReader States
6.2.3.
Reading a File or Blob
The FileReader interface makes available several asynchronous read methods
(readAsArrayBuffer(), readAsBinaryString(), readAsText(), and readAsDataURL())
which read files into memory.
Note: If multiple concurrent read methods are called on the same FileReader object,
user agents throw an InvalidStateError on any of the read methods that occur
when readyState = LOADING.
(FileReaderSync makes available several synchronous read methods.
Collectively, the sync and async read methods of FileReader and FileReaderSync
are referred to as just read methods.)
6.2.3.1.
The readAsDataURL() method
The readAsDataURL(blob) method, when invoked, must initiate a read operation for blob with DataURL.
6.2.3.2.
The readAsText() method
The readAsText(blob, encoding) method, when invoked, must initiate a read operation for blob with Text and encoding.
6.2.3.3.
The readAsArrayBuffer() method
The readAsArrayBuffer(blob) method, when invoked, must initiate a read operation for blob with ArrayBuffer.
6.2.3.4.
The readAsBinaryString() method
The readAsBinaryString(blob) method, when invoked, must initiate a read operation for blob with BinaryString.
Note:
The use of
readAsArrayBuffer()
is preferred over
readAsBinaryString()
, which is provided for backwards
compatibility.
6.2.3.5.
The
abort()
method
When the abort() method is called, the user agent must run the steps below:
1. If this’s state is "empty" or if this’s state is "done", set this’s result to null and terminate this algorithm.
2. If this’s state is "loading", set this’s state to "done" and set this’s result to null.
3. If there are any tasks from this on the file reading task source in an affiliated task queue, then remove those tasks from that task queue.
4. Terminate the algorithm for the read method being processed.
5. Fire a progress event called abort at this.
6. If this’s state is not "loading", fire a progress event called loadend at this.
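The state handling in these steps can be modelled with a small class. This is a simplified sketch, not a FileReader implementation: ReaderStateModel is a hypothetical name, it records event names in an array instead of dispatching real events, and it never reads any data.

```javascript
// Simplified model of FileReader state handling around abort().
// ReaderStateModel is hypothetical; it logs event names and reads no data.
class ReaderStateModel {
  constructor() {
    this.state = "empty";
    this.result = null;
    this.events = [];
    this.onabort = null;
  }

  startRead() {
    if (this.state === "loading") throw new Error("InvalidStateError");
    this.state = "loading";
    this.result = null;
  }

  abort() {
    if (this.state === "empty" || this.state === "done") {
      this.result = null;
      return; // nothing is being read, so no events fire
    }
    this.state = "done";
    this.result = null;
    this.events.push("abort");
    if (this.onabort) this.onabort();
    // loadend is skipped when the abort handler has already started a new read.
    if (this.state !== "loading") this.events.push("loadend");
  }
}

const idle = new ReaderStateModel();
idle.abort();
console.log(idle.events.length); // 0

const busy = new ReaderStateModel();
busy.startRead();
busy.abort();
console.log(busy.events.join(", ")); // abort, loadend

const chained = new ReaderStateModel();
chained.onabort = () => chained.startRead();
chained.startRead();
chained.abort();
console.log(chained.events.join(", "), chained.state); // abort loading
```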
6.3.
Packaging data
Blob
has an associated
package data
algorithm,
given
bytes
, a
type
, a optional
mimeType
, and a optional
encodingName
which switches on
type
and runs the associated steps:
DataURL
Return
bytes
as a DataURL
[RFC2397]
subject to the considerations below:
Better specify how the DataURL is generated.
[Issue #104]
Text
Let
encoding
be failure.
If the
encodingName
is present, set
encoding
to the result of
getting an encoding
from
encodingName
If
encoding
is failure, and
mimeType
is present:
Let
type
be the result of
parse a MIME type
given
mimeType
If
type
is not failure,
set
encoding
to the result of
getting an encoding
from
type
’s
parameters
"charset"
].
If
blob
has a
type
attribute of
text/plain;charset=utf-8
then
getting an encoding
is run using
"utf-8"
as the label.
Note that user agents must parse and extract the portion of the Charset Parameter
that constitutes a
label
of an encoding.
If
encoding
is failure, then set
encoding
to
UTF-8
Decode
bytes
using fallback encoding
encoding
, and return the result.
ArrayBuffer
Return a new
ArrayBuffer
whose contents are
bytes
BinaryString
Return
bytes
as a binary string,
in which every byte is represented by a code unit of equal value [0..255].
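The switch above can be sketched in script as follows. This is a simplified illustration under several assumptions: the function names packageData, getDecoder and charsetOf are hypothetical; the charset extraction uses a regular expression rather than a full MIME type parser; TextDecoder does not perform the byte order mark sniffing that the decode algorithm does; and, because DataURL generation is not fully specified (see Issue #104), the base64 form shown is only one plausible choice.

```javascript
// Hypothetical helper: map an encoding label to a TextDecoder, or null on failure.
function getDecoder(label) {
  try {
    return new TextDecoder(label);
  } catch (e) {
    return null; // unknown label
  }
}

// Hypothetical helper: simplified extraction of the charset parameter.
function charsetOf(mimeType) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(mimeType);
  return match ? match[1] : null;
}

// bytes is a Uint8Array; type is one of the four strings below.
function packageData(bytes, type, mimeType, encodingName) {
  switch (type) {
    case "DataURL": {
      // Assumption: base64 form with the MIME type as the media type.
      const binary = Array.from(bytes, (b) => String.fromCharCode(b)).join("");
      return "data:" + (mimeType || "") + ";base64," + btoa(binary);
    }
    case "Text": {
      let decoder = null;
      if (encodingName !== undefined) decoder = getDecoder(encodingName);
      if (decoder === null && mimeType !== undefined) {
        const charset = charsetOf(mimeType);
        if (charset !== null) decoder = getDecoder(charset);
      }
      if (decoder === null) decoder = new TextDecoder("utf-8");
      return decoder.decode(bytes);
    }
    case "ArrayBuffer":
      // Copy so the returned buffer holds exactly these bytes.
      return bytes.slice().buffer;
    case "BinaryString":
      // One code unit per byte, each in the range 0..255.
      return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
    default:
      throw new TypeError("unknown type: " + type);
  }
}
```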
6.4.
Events
The
FileReader
object must be the event target for all events in this specification.
When this specification says to
fire a progress event
called e
(for some
ProgressEvent
at a given
FileReader
reader
),
the following are normative:
6.4.1.
Event Summary
The following are the events that are
fired
at
FileReader
objects.
6.4.2.
Summary of Event Invariants
This section is informative.
The following are invariants applicable to
event firing
for a given asynchronous
read method
in this specification:
Once a
loadstart
has been fired,
a corresponding
loadend
fires at completion of the read,
UNLESS any of the following are true:
the
read method
has been cancelled using
abort()
and a new
read method
has been invoked
the event handler function for a
load
event initiates a new read
the event handler function for an
error
event initiates a new read.
Note:
The events
loadstart
and
loadend
are not coupled in a one-to-one manner.
This example showcases "read-chaining":
initiating another read from within an event handler while the "first" read continues processing.
// In code of the sort...
reader.readAsText(file);
reader.onload = function(){reader.readAsText(alternateFile);}
// .....
//... the loadend event must not fire for the first read

// The abort handler is registered before abort() is called, so that it runs
reader.onabort = function(){reader.readAsText(updatedFile);}
reader.readAsText(file);
reader.abort();
//... the loadend event must not fire for the first read
One
progress
event will fire when
blob
has been completely read into memory.
No
progress
event fires before
loadstart
No
progress
event fires after any one of
abort
load
, and
error
have fired.
At most one of
abort
load
, and
error
fires for a given read.
No
abort
load
, or
error
event fires after
loadend
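For a single read, the ordering invariants above can be expressed as a small checker over a recorded list of event names. This is an illustrative sketch; the function name firstViolation and the violation messages are hypothetical, and it does not model the read-chaining exceptions to the loadstart and loadend pairing.

```javascript
// Returns a description of the first violated invariant, or null if none.
function firstViolation(events) {
  const terminal = new Set(["abort", "load", "error"]);
  let started = false;      // loadstart seen
  let terminalSeen = false; // one of abort/load/error seen
  let ended = false;        // loadend seen
  for (const name of events) {
    if (name === "loadstart") {
      started = true;
    } else if (name === "progress") {
      if (!started) return "progress before loadstart";
      if (terminalSeen) return "progress after abort/load/error";
    } else if (terminal.has(name)) {
      if (terminalSeen) return "more than one of abort/load/error";
      if (ended) return name + " after loadend";
      terminalSeen = true;
    } else if (name === "loadend") {
      ended = true;
    }
  }
  return null;
}
```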
6.5.
Reading on Threads
Web Workers allow for the use of synchronous
File
or
Blob
read APIs,
since such reads on threads do not block the main thread.
This section defines a synchronous API, which can be used within Workers
[Workers]
Workers can avail of both the asynchronous API (the
FileReader
object)
and
the synchronous API (the
FileReaderSync
object).
This interface provides methods to
synchronously read
File
or
Blob
objects into memory.
[Exposed=(DedicatedWorker,SharedWorker)]
interface FileReaderSync {
  constructor();
  // Synchronously return strings

  ArrayBuffer readAsArrayBuffer(Blob blob);
  DOMString readAsBinaryString(Blob blob);
  DOMString readAsText(Blob blob, optional DOMString encoding);
  DOMString readAsDataURL(Blob blob);
};
6.5.1.1.
Constructors
When the
FileReaderSync()
constructor is invoked,
the user agent must return a new
FileReaderSync
object.
The
readAsText(
blob
encoding
method,
when invoked, must run these steps:
Let
stream
be the result of calling
get stream
on
blob
Let
reader
be the result of
getting a reader
from
stream
Let
promise
be the result of
reading all bytes
from
stream
with
reader
Wait for
promise
to be fulfilled or rejected.
If
promise
fulfilled with a
byte sequence
bytes
Return the result of
package data
given
bytes
Text
blob
’s
type
, and
encoding
Throw
promise
’s rejection reason.
The
readAsDataURL(
blob
method,
when invoked, must run these steps:
Let
stream
be the result of calling
get stream
on
blob
Let
reader
be the result of
getting a reader
from
stream
Let
promise
be the result of
reading all bytes
from
stream
with
reader
Wait for
promise
to be fulfilled or rejected.
If
promise
fulfilled with a
byte sequence
bytes
Return the result of
package data
given
bytes
DataURL
, and
blob
’s
type
Throw
promise
’s rejection reason.
The
readAsArrayBuffer(
blob
method,
when invoked, must run these steps:
Let
stream
be the result of calling
get stream
on
blob
Let
reader
be the result of
getting a reader
from
stream
Let
promise
be the result of
reading all bytes
from
stream
with
reader
Wait for
promise
to be fulfilled or rejected.
If
promise
fulfilled with a
byte sequence
bytes
Return the result of
package data
given
bytes
ArrayBuffer
, and
blob
’s
type
Throw
promise
’s rejection reason.
The
readAsBinaryString(
blob
method,
when invoked, must run these steps:
Let
stream
be the result of calling
get stream
on
blob
Let
reader
be the result of
getting a reader
from
stream
Let
promise
be the result of
reading all bytes
from
stream
with
reader
Wait for
promise
to be fulfilled or rejected.
If
promise
fulfilled with a
byte sequence
bytes
Return the result of
package data
given
bytes
BinaryString
, and
blob
’s
type
Throw
promise
’s rejection reason.
Note:
The use of
readAsArrayBuffer()
is preferred over
readAsBinaryString()
, which is provided for
backwards compatibility.
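The four methods above share the same first steps: get a stream, get a reader, read all bytes, then package the result. The sketch below shows those shared steps in asynchronous form, using a promise where FileReaderSync would block. It is an illustration under assumptions: the function names readAllBytes and readAsTextSketch are hypothetical, it relies on Blob.stream() and TextDecoder being available, and the final decoding step is a simplification of the package data algorithm that ignores the blob's type.

```javascript
// Read every chunk from the blob's stream and concatenate into one Uint8Array.
async function readAllBytes(blob) {
  const reader = blob.stream().getReader();
  const chunks = [];
  let total = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    total += value.byteLength;
  }
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes;
}

// Simplified counterpart of readAsText(): a rejection propagates to the
// caller, mirroring the "throw the rejection reason" step.
async function readAsTextSketch(blob, encoding) {
  const bytes = await readAllBytes(blob);
  return new TextDecoder(encoding === undefined ? "utf-8" : encoding).decode(bytes);
}
```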
7.
Errors and Exceptions
File read errors
can occur when reading files from the underlying filesystem.
The list below of potential error conditions is
informative
The
File
or
Blob
being accessed may not exist
at the time one of the
asynchronous read methods
or
synchronous read methods
are called.
This may be due to it having been moved or deleted after a reference to it was acquired
(e.g. concurrent modification with another application).
See
NotFoundError
File
or
Blob
may be unreadable.
This may be due to permission problems that occur after a reference to a
File
or
Blob
has been acquired
(e.g. concurrent lock with another application).
Additionally, the
snapshot state
may have changed.
See
NotReadableError
User agents MAY determine that some files are unsafe for use within Web applications.
A file may change on disk since the original file selection,
thus resulting in an invalid read.
Additionally, some file and directory structures may be considered restricted by the underlying filesystem;
attempts to read from them may be considered a security violation.
See
§ 9 Security and Privacy Considerations
and
SecurityError
7.1.
Throwing an Exception or Returning an Error
This section is normative.
Error conditions can arise when reading a
File
or a
Blob
The
read operation
can terminate due to error conditions when reading a
File
or a
Blob
the particular error condition that causes the
get stream
algorithm to fail
is called a
failure reason
. A
failure reason
is one of
NotFound
UnsafeFile
TooManyReads
SnapshotState
, or
FileLock
Synchronous read methods
throw
exceptions of the type in the table below
if there has been an error owing to a particular
failure reason
Asynchronous read methods use the
error
attribute of the
FileReader
object,
which must return a
DOMException
object of the most appropriate type from the table below
if there has been an error owing to a particular
failure reason
or otherwise return null.
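The difference between the two reporting styles can be sketched as follows. This is illustrative only: the function names are hypothetical, and the mapping from failure reason to exception name is inferred from the informative list in section 7 (the TooManyReads row in particular is an assumption of this sketch).

```javascript
// Assumed mapping from failure reason to DOMException name.
const EXCEPTION_NAME_FOR = new Map([
  ["NotFound", "NotFoundError"],
  ["UnsafeFile", "SecurityError"],
  ["TooManyReads", "SecurityError"], // assumption
  ["SnapshotState", "NotReadableError"],
  ["FileLock", "NotReadableError"],
]);

function exceptionFor(failureReason) {
  const name = EXCEPTION_NAME_FOR.get(failureReason);
  if (name === undefined) {
    throw new TypeError("unknown failure reason: " + failureReason);
  }
  return new DOMException("read failed: " + failureReason, name);
}

// Synchronous style: the exception is thrown to the caller.
function syncReadOutcome(failureReason) {
  if (failureReason !== null) throw exceptionFor(failureReason);
  return "ok";
}

// Asynchronous style: the exception is exposed through an error attribute,
// which is null when no error occurred.
function asyncReadOutcome(failureReason) {
  return { error: failureReason === null ? null : exceptionFor(failureReason) };
}
```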
8.
A URL for Blob and MediaSource reference
This section defines a
scheme
for a
URL
used to refer to
Blob
and
MediaSource
objects.
8.1.
Introduction
This section is informative.
Blob (or object) URLs
are URLs like
blob:http://example.com/550e8400-e29b-41d4-a716-446655440000
This enables integration of
Blob
s and
MediaSource
s with other
APIs that are only designed to be used with URLs, such as the
img
element.
Blob URLs
can also be used to navigate to as well as to trigger downloads
of locally generated data.
For this purpose two static methods are exposed on the
URL
interface,
createObjectURL(obj)
and
revokeObjectURL(url)
The first method creates a mapping from a
URL
to a
Blob
and the second method revokes said mapping.
As long as the mapping exists the
Blob
can’t be garbage collected,
so some care must be taken to revoke the URL as soon as the reference is no longer needed.
All URLs are revoked when the global that created the URL itself goes away.
8.2.
Model
Each user agent must maintain a
blob URL store
blob URL store
is a
map
where
keys
are
valid URL strings
and
values
are
blob URL Entries
blob URL entry
consists of an
object
(of type
Blob
or
MediaSource
), and an
environment
(an
environment settings object
).
Note:
Specifications have to use the
obtain a blob object
algorithm to access a
blob URL entry
’s
object
Keys
in the
blob URL store
(also known as
blob URLs
are
valid URL strings
that when
parsed
result in a
URL
with a
scheme
equal to "
blob
",
an
empty host
, and a
path
consisting of one element itself also a
valid URL string
To
generate a new blob URL
, run the following steps:
Let
result
be the empty string.
Append the string "
blob:
" to
result
Let
settings
be the
current settings object
Let
origin
be
settings
’s
origin
Let
serialized
be the
ASCII serialization
of
origin
If
serialized
is "
null
", set it to an implementation-defined value.
Append
serialized
to
result
Append U+002F SOLIDUS (
/
) to
result
Generate a UUID
[RFC4122]
as a string and append it to
result
Return
result
An example of a blob URL that can be generated by this algorithm is
blob:https://example.org/40a5fb5a-d56d-4a33-b4e2-0acf6a8e5f64
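The generation steps can be sketched as a plain string-building function. This is a minimal illustration: the function name generateBlobURL and its parameters are hypothetical, the serialized origin is passed in rather than taken from a settings object, the default UUID source assumes crypto.randomUUID() is available, and the value substituted for a "null" origin is an arbitrary choice since the steps leave it implementation-defined.

```javascript
function generateBlobURL(
  serializedOrigin,
  makeUUID = () => globalThis.crypto.randomUUID(), // assumed UUID source
  opaqueOriginValue = "null"                        // implementation-defined
) {
  let result = "";
  result += "blob:";
  let serialized = serializedOrigin;
  if (serialized === "null") serialized = opaqueOriginValue;
  result += serialized;
  result += "/"; // U+002F SOLIDUS
  result += makeUUID();
  return result;
}
```

Injecting the UUID generator keeps the function deterministic when a fixed value is supplied, which is convenient for testing.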
To
remove an entry from the blob URL store
for a given
url
run the following steps:
Let
store
be the user agent’s
blob URL store
Let
url string
be the result of
serializing
url
Remove
store
url string
].
8.3.
Dereferencing Model for blob URLs
To
resolve a blob URL
given a
URL
url
Assert
url
’s
scheme
is "
blob
".
Let
store
be the user agent’s
blob URL store
Let
url string
be the result of
serializing
url
with the
exclude fragment flag
set.
If
store
url string
exists
, return
store
url string
]; otherwise return failure.
Further requirements for the parsing and fetching model for
blob URLs
are defined in the
[URL]
and
[Fetch]
specifications.
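The store, the removal steps and the resolution steps can be modelled with a Map keyed by URL string. This is an illustrative model, not how any user agent is implemented: the names blobURLStore, resolveBlobURL and removeEntry are hypothetical, null stands in for failure, and fragment exclusion is done by truncating the serialized URL at the first "#".

```javascript
// Model of the blob URL store: URL string -> entry ({ object, environment }).
const blobURLStore = new Map();

// Serialize a URL with the fragment excluded.
function serializeWithoutFragment(url) {
  return url.href.split("#")[0];
}

// Resolve a blob URL: look up the fragment-less serialization in the store.
function resolveBlobURL(url) {
  if (url.protocol !== "blob:") {
    throw new Error("assertion failed: scheme is not blob");
  }
  const urlString = serializeWithoutFragment(url);
  return blobURLStore.has(urlString) ? blobURLStore.get(urlString) : null;
}

// Remove an entry from the store for a given URL.
function removeEntry(url) {
  blobURLStore.delete(url.href);
}
```

Because resolution ignores the fragment, a URL such as the stored key followed by "#t=1" resolves to the same entry as the key itself.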
8.3.1.
Origin of blob URLs
This section is informative.
The origin of a blob URL is always the same as that of the environment that created the URL,
as long as the URL hasn’t been revoked yet. This is achieved by the
[URL]
spec looking up
the URL in the
blob URL store
when parsing a URL, and using that entry to return
the correct origin.
If the URL was revoked the serialization of the origin will still remain the same
as the serialization of the origin of the environment that created the blob URL,
but for opaque origins the origin itself might be distinct. This difference isn’t
observable though, since a revoked blob URL can’t be resolved/fetched anymore anyway.
8.3.2.
Access restrictions on blob URLs
Blob URLs
can only be fetched from environments where the
storage key
matches that of
the environment where the
blob URL
was created.
Blob URL
navigations are not subject
to this restriction.
8.3.3.
Lifetime of blob URLs
This specification extends the
unloading document cleanup steps
with the following steps:
Let
environment
be the
Document
’s
relevant settings object
Let
store
be the user agent’s
blob URL store
Remove from
store
any entries for which the
value
’s
environment
is equal to
environment
This needs a similar hook when a worker is unloaded.
8.4.
Creating and Revoking a blob URL
Blob URLs
are created and revoked using static methods exposed on the
URL
object.
Revocation of a
blob URL
decouples the
blob URL
from the resource it refers to,
and if it is dereferenced after it is revoked,
user agents must act as if a
network error
has occurred.
This section describes a supplemental interface to the URL specification
[URL]
and presents methods for
blob URL
creation and revocation.
[Exposed=(Window,DedicatedWorker,SharedWorker)]
partial interface URL {
  static DOMString createObjectURL((Blob or MediaSource) obj);
  static undefined revokeObjectURL(DOMString url);
};
The
revokeObjectURL(
url
static method must run these steps:
Let
urlRecord
be the result of
parsing
url
If
urlRecord
’s
scheme
is not "
blob
", return.
Let
entry
be
urlRecord
’s
blob URL entry
If
entry
is null, then return.
Let
isAuthorized
be the result of
checking for same-partition blob URL usage
with
entry
and the
current settings object
If
isAuthorized
is false, then return.
Remove an entry from the Blob URL Store
for
url
Note:
This means that rather than throwing some kind of error, attempting to revoke a URL that isn’t registered or that was registered from an environment in a different storage partition will silently fail.
User agents might display a message on the error console if this happens.
Note:
Attempts to dereference
url
after it has been revoked will result in a
network error
Requests that were started before the
url
was revoked should still succeed.
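The silent-failure behaviour described in the notes above can be observed directly from script. The sketch below assumes an environment that exposes Blob, URL.createObjectURL() and URL.revokeObjectURL() as globals; the variable names and the example.org URLs are illustrative.

```javascript
const blob = new Blob(["hello"], { type: "text/plain" });
const url = URL.createObjectURL(blob);

// Removes the mapping from the blob URL store.
URL.revokeObjectURL(url);

// None of the following throws: each call returns without doing anything.
URL.revokeObjectURL(url);                                      // already revoked
URL.revokeObjectURL("blob:https://example.org/not-registered"); // never registered
URL.revokeObjectURL("https://example.org/");                    // scheme is not "blob"
```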
In the example below,
window1
and
window2
are separate,
but in the
same origin
window2
could be an
iframe
inside
window1
myurl = window1.URL.createObjectURL(myblob);
window2.URL.revokeObjectURL(myurl);
Since a user agent has one global
blob URL store
it is possible to revoke an object URL from a different window than from which it was created.
The
URL.
revokeObjectURL()
call
ensures that subsequent dereferencing of
myurl
results in the user agent acting as if a
network error
has occurred.
8.4.1.
Examples of blob URL Creation and Revocation
Blob URL
s are strings that are used to
fetch
Blob
objects,
and can persist for as long as the
document
from which they were minted
using
URL.
createObjectURL()
—​see
§ 8.3.3 Lifetime of blob URLs
This section gives sample usage of creation and revocation of
blob URL
s with explanations.
In the example below, two
img
elements
[HTML]
refer to the same
blob URL
url = URL.createObjectURL(blob);
img1.src = url;
img2.src = url;
In the example below,
URL.
revokeObjectURL()
is explicitly called.
var blobURLref = URL.createObjectURL(file);
img1 = new Image();
img2 = new Image();

// Both assignments below work as expected
img1.src = blobURLref;
img2.src = blobURLref;

// ... Following body load
// Check if both images have loaded
if(img1.complete && img2.complete) {
  // Ensure that subsequent dereferences result in a network error
  URL.revokeObjectURL(blobURLref);
} else {
  msg("Images cannot be previewed!");
  // revoke the string-based reference
  URL.revokeObjectURL(blobURLref);
}
The example above allows multiple references to a single
blob URL
and the web developer then revokes the
blob URL
string after both image objects have been loaded.
While not restricting the number of uses of the
blob URL
offers more flexibility,
it increases the likelihood of leaks;
developers should pair it with a corresponding call to
URL.
revokeObjectURL()
9.
Security and Privacy Considerations
This section is informative.
This specification allows web content to read files from the underlying file system,
as well as provides a means for files to be accessed by unique identifiers,
and as such is subject to some security considerations.
This specification also assumes that the primary user interaction is with the
<input type="file"/>
element of HTML forms
[HTML]
and that all files that are being read by
FileReader
objects have first been selected by the user.
Important security considerations include preventing malicious file selection attacks (selection looping),
preventing access to system-sensitive files,
and guarding against modifications of files on disk after a selection has taken place.
Preventing selection looping
During file selection, a user may be bombarded with the file picker associated with
<input type="file"/>
(in a "must choose" loop that forces selection before the file picker is dismissed)
and a user agent may prevent file access to any selections by making the
FileList
object returned be of size 0.
System-sensitive files
(e.g. files in /usr/bin, password files, and other native operating system executables)
typically should not be exposed to web content,
and should not be accessed via
blob URLs
User agents may
throw a
SecurityError
exception for synchronous read methods,
or return a
SecurityError
exception for asynchronous reads.
This section is provisional; more security data may supplement this in subsequent drafts.
10.
Requirements and Use Cases
This section covers what the requirements are for this API,
as well as illustrates some use cases.
This version of the API does not satisfy all use cases;
subsequent versions may elect to address these.
Once a user has given permission,
user agents should provide the ability to read and parse data directly from a local file programmatically.
A lyrics viewer.
User wants to read song lyrics from songs in his plist file.
User browses for plist file.
File is opened, read, parsed, and presented to the user as a sortable, actionable list within a web application.
User can select songs to fetch lyrics.
User uses the "browse for file" dialog.
Data should be able to be stored locally so that it is available for later use,
which is useful for offline data access for web applications.
A Calendar App.
User’s company has a calendar.
User wants to sync local events to company calendar,
marked as "busy" slots (without leaking personal info).
User browses for file and selects it.
The
text/calendar
file is parsed in the browser,
allowing the user to merge the files to one calendar view.
The user wants to then save the file back to his local calendar file (using "Save As"?).
The user can also send the integrated calendar file back to the server calendar store asynchronously.
User agents should provide the ability to save a local file programmatically given an amount of data and a file name.
Note:
While this specification doesn’t provide an explicit API call to trigger downloads,
the HTML5 specification has addressed this.
The
download
attribute of the
a
element initiates a download,
saving a
File
with the name specified.
The combination of this API and the
download
attribute on
a
elements
allows for the creation of files within web applications,
and the ability to save them locally.
A Spreadsheet App.
User interacts with a form, and generates some input.
The form then generates a CSV (Comma-Separated Values) output for the user to import into a spreadsheet,
and uses "Save...".
The generated output can also be directly integrated into a web-based spreadsheet,
and uploaded asynchronously.
User agents should provide a streamlined programmatic ability to send data from a file to a remote server
that works more efficiently than form-based uploads today.
A Video/Photo Upload App.
User is able to select large files for upload,
which can then be "chunk-transferred" to the server.
User agents should provide an API exposed to script that exposes the features above.
The user is notified by UI anytime interaction with the file system takes place,
giving the user full ability to cancel or abort the transaction.
The user is notified of any file selections,
and can cancel these.
No invocations to these APIs occur silently without user intervention.
Acknowledgements
This specification was originally developed by the SVG Working Group. Many thanks to Mark Baker and Anne van Kesteren for their feedback.
Thanks to Robin Berjon, Jonas Sicking and Vsevolod Shmyroff for editing the original specification.
Special thanks to Olli Pettay, Nikunj Mehta, Garrett Smith, Aaron Boodman, Michael Nordman, Jian Li, Dmitry Titov, Ian Hickson, Darin Fisher, Sam Weinig, Adrian Bateman and Julian Reschke.
Thanks to the W3C WebApps WG, and to participants on the public-webapps@w3.org listserv