Why would you ask for zero bytes from a Java InputStream?

12 04 2010

When would one pass a length argument of zero to java.io.InputStream.read(byte[], int, int) so as to not read any bytes? Does anyone have a good example of when this is necessary or convenient?

The method’s javadoc shows that it explicitly caters for being passed a length of zero, but to me that looks like an unnecessary complication that has plenty of potential for misunderstanding, incorrect implementation by subclasses, and a risk of infinite loops in client code.

I’ve been trying to imagine what common situation might justify catering for a request to read zero bytes, but haven’t come up with anything convincing.

The actual wording of the Javadoc is “If len is zero, then no bytes are read and 0 is returned; otherwise…” and it then goes on to explain its normal processing within the “otherwise…” clause, including the handling of end-of-stream and any IOExceptions that might occur. There are some separate paragraphs before and after this, and separate explanations of argument validation, but it seems quite clear that if the length is zero the “if len is zero” statement applies instead of the normal processing and its various conditions and outcomes.

At first glance that seems straightforward and simplifies things – the remainder of the rules only apply for non-zero lengths.

However, it’s not as simple as it seems:

  • If you’re already at end-of-stream, reading zero bytes will complete as normal, won’t change anything, and will return zero. It’s easy to see how a caller could get stuck in an infinite loop if they’re not explicitly checking for this. (Conversely, if the caller is explicitly checking for a result of zero, it wouldn’t appear to be any harder for the caller to instead check for a length of zero beforehand and avoid the call altogether.) It also means you can’t use a read of zero bytes as a safe way of just checking whether you’ve reached end-of-stream yet.
  • The javadoc says, quite separately, that an IOException is thrown if the stream is already closed. It isn’t clear which condition takes precedence if zero bytes are requested but the stream is also already closed. More generally it’s not clear whether this is specifying that an IOException SHOULD be thrown if the stream is already closed or just explaining that this MAY result in an IOException (i.e. if an attempt to actually use the stream happens to result in such an exception). So depending on how you read it, you can argue either that an attempt to read zero bytes when the stream is already closed should complete normally and return zero, or that it should throw an IOException.
  • It’s invalid to specify an offset and length that together exceed the size of the destination array (such that writing the bytes into the array would go out-of-bounds). This appears to apply even if the length is zero, and that is indeed how it’s implemented in the source code (at least, in the Sun JDK 6 source code). But this is somewhat inconsistent with the general treatment of a zero length as returning zero regardless of other issues (e.g. even if already at end-of-stream). Arguably it would be more appropriate and more consistent with the rest of the specification to completely ignore the offset and array arguments if the length is zero and you’re not actually going to read any bytes into the array.
  • If a call successfully reads one or more bytes but then encounters an exception, the read ends at that point and returns normally, with the exception then being thrown for the first byte of the next read. But if the next read is for zero bytes, it will complete successfully without even attempting a read, and won’t encounter the exception. Whilst that’s in keeping with the normal behaviour of the method, it’s yet another thing that callers asking for zero bytes might need to be aware of and cater for (depending on exactly what they’re doing and how the read of zero bytes arises).
  • InputStream implementations aren’t entirely consistent with this specification, even within the JDK. In particular, the Javadoc for java.io.ByteArrayInputStream says that it tests for end of stream and returns -1 prior to considering whether to read any bytes or return zero. Hence if a ByteArrayInputStream is at end of stream and you ask to read zero bytes, it gives you -1 to indicate end-of-stream rather than zero as specified by the underlying InputStream base class. With the various ambiguities noted above, third-party InputStream implementations of this method are probably even more likely to be inconsistent in how they handle reads of zero bytes.
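
To make a few of these points concrete, here is a minimal sketch rather than anything authoritative. The class name ZeroLengthReadDemo and its helper methods are hypothetical names invented for this illustration. It asks for zero bytes from a stream that is already at end-of-stream, once using a subclass that inherits read(byte[], int, int) from InputStream unchanged and once using ByteArrayInputStream, and it also tries a zero-length request with an offset beyond the end of the array.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ZeroLengthReadDemo {

    /** Hypothetical helper: a stream that is always at end-of-stream and
     *  inherits read(byte[], int, int) from InputStream unchanged. */
    static InputStream emptyBaseStream() {
        return new InputStream() {
            @Override
            public int read() {
                return -1; // always end-of-stream
            }
        };
    }

    /** Requests zero bytes from the given stream and returns the result. */
    static int readZeroBytes(InputStream in, byte[] buffer, int offset) {
        try {
            return in.read(buffer, offset, 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reports whether a zero-length request with an out-of-range offset is rejected. */
    static boolean rejectsBadOffset(InputStream in) {
        try {
            readZeroBytes(in, new byte[4], 5); // offset 5 lies beyond a 4-byte array
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[4];
        System.out.println("base class at EOF: "
                + readZeroBytes(emptyBaseStream(), buffer, 0));
        System.out.println("ByteArrayInputStream at EOF: "
                + readZeroBytes(new ByteArrayInputStream(new byte[0]), buffer, 0));
        System.out.println("bad offset rejected: "
                + rejectsBadOffset(emptyBaseStream()));
    }
}
```

If the behaviour described in the list holds on your JDK, the first call should report zero, the second should report -1, and the out-of-range offset should be rejected even though no bytes would have been written. Third-party streams may well behave differently.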

So why isn’t a length argument of zero just prohibited? As far as I can see, the typical use of this method shouldn’t normally involve passing a length of zero, and any client code that really can result in a legitimate call for a length of zero is probably going to have to do something to explicitly handle it anyway (for example, to avoid getting stuck in a loop). The length is already required to be non-negative, so why isn’t it just required to be greater than zero instead? That would seem to be a lot simpler and less open to misinterpretation, misuse or incorrect implementation.
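
As an illustration of the kind of explicit handling meant here, the following is a minimal sketch of a read loop that never issues a zero-length request. The class GuardedRead and the method readUpTo are hypothetical names made up for this example, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class GuardedRead {

    /**
     * Hypothetical helper: reads until len bytes have been stored or
     * end-of-stream is reached, and returns the number of bytes stored.
     * The loop condition guarantees read() is only ever called with a
     * positive length, so a result of zero never has to be interpreted.
     */
    static int readUpTo(InputStream in, byte[] buf, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n < 0) {
                break; // end-of-stream before len bytes arrived
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] buf = new byte[8];
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println("bytes read: " + readUpTo(in, buf, 0, buf.length));
        System.out.println("zero requested: " + readUpTo(in, buf, 0, 0));
    }
}
```

With this shape the caller has already dealt with the zero case through the loop condition, so it gains nothing from the stream also accepting a length of zero.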

What am I missing? Can anyone enlighten me with a good example of something that benefits from being able to ask for zero bytes? That is, a relatively common use of InputStream.read(byte[], int, int) where passing a “len” argument of zero can actually occur, and where allowing this is significantly more convenient for callers than requiring the caller to explicitly check for and handle this case itself.

Please note that I’m not for a moment suggesting that something this well-established could realistically be changed at this point. I’m just curious as to why it is the way it is. Is it a mistake? A lack of attention to detail that we’re now stuck with? Or is there a really good reason for it that I just haven’t come across yet?




4 responses

13 04 2010
Andrae Muys

Because the alternative would violate algebraic closure; possibly the most fundamental rule of good API design.

13 04 2010


Sounds intriguing but what does it mean in this context? Any reference material you could point me to?

I can’t immediately see why defining the length argument as a positive integer rather than a non-negative one would violate some design rule… the returned value isn’t in the same domain as the length argument, or usable as one, anyway (e.g. return values include -1). Even if it were, precluding a zero length would also preclude a zero return value. And there are already other constraints on the length argument (length plus offset must not exceed the array length).

But I don’t want to remain ignorant of a fundamental rule of good API design, so any reference material about this would be welcome!

7 02 2013

You might want to block until data is available without discarding any of the data.

8 02 2013

Thanks! Yes!

Re-reading the Javadoc, it’s not absolutely explicit that it will block rather than immediately returning zero anyway, but that does seem by far the most reasonable interpretation and behaviour, so I do see you could use it for that.

I’d previously thought the “available” method might fit that sort of requirement, but it doesn’t block and is very weakly defined (e.g. for InputStream always returns zero) so it’s clearly not a complete answer for such situations.
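
For what it’s worth, the weakness of “available” is easy to see with a minimal sketch like the one below. The class AvailableDemo and its helpers are hypothetical names for this illustration only. It compares a stream that has data but inherits available() from InputStream with a ByteArrayInputStream holding three bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class AvailableDemo {

    /** Hypothetical helper: a stream that always has another byte to give
     *  but does not override available(). */
    static InputStream endlessBaseStream() {
        return new InputStream() {
            @Override
            public int read() {
                return 42; // data is always there
            }
        };
    }

    /** Calls available() and converts the checked exception for convenience. */
    static int availableOf(InputStream in) {
        try {
            return in.available();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("inherited available(): " + availableOf(endlessBaseStream()));
        System.out.println("ByteArrayInputStream available(): "
                + availableOf(new ByteArrayInputStream(new byte[3])));
    }
}
```

The inherited method reports zero even though a read would succeed at once, so a result of zero from available() says nothing reliable about whether a read would block.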

Presumably it’s not very common to want to block until readable but then not actually read anything (and depending on the InputStream implementation maybe it’s even possible for the stream to block again immediately after you detect that it’s not blocked) – but I can imagine there are such situations.

So you’ve answered my question!
