ReadableResource: A common interface for files, URLs, byte arrays, and other readable resources.

2 04 2007

In this article I’d like to explain a technique I’ve been using for the last few years to unify the handling of different types of resources that can be accessed via an InputStream (such as files, URLs, byte arrays, classpath resources etc). There are probably lots of libraries and frameworks that provide similar facilities, but I’m not immediately aware of any, and I’ve found this to be a useful approach, so it seems worth explaining.

The problem being addressed is that whilst InputStream is a useful abstraction for the many different types of resources that can be read from, InputStreams aren’t always convenient for passing around or storing.

If you want to pass an InputStream to a method, every caller needs to do the “opening” of the resource in order to obtain the InputStream, and has to cater for the relevant possible failures and checked exceptions. Similarly, as the caller is doing the “opening” of the resource it normally makes sense for the caller to also be responsible for closing the InputStream (typically using “try…finally”), and this also normally involves error handling. Whilst this code might in some cases belong in the caller, in many cases it more properly belongs in the code that actually uses the resource, and is just a burden and duplication for each caller. A more fundamental problem is that of timing: if you want to pass a resource to an object for it to use at some later time (perhaps repeatedly), passing an already-open InputStream and then closing it is entirely inappropriate.

One alternative is to actually read the content of the resource and pass it as a byte array. This can sometimes avoid the timing issue (if you’re not concerned about getting completely up-to-date content), and it makes the receiving method or object very flexible. It’s also convenient for testing. However, it puts an even greater burden on the callers, and requires the entire contents of the resource to be held in memory. Sometimes this is appropriate, but often it isn’t.

Another alternative is to pass the relevant resource itself, or at least its identity, rather than an InputStream. That is, pass a File or file name, or a URI/URL or its string representation, or a byte[], or the path of a classpath resource etc. This is simple and avoids having to obtain and pass an InputStream, but it restricts the receiving code to one specific type of resource (or requires separate methods for each type of resource that is supported).

So we have various convenient classes for each specific type of resource (e.g. File), and a more general InputStream class that caters for all types of resource, but whose instances only exist as a result of actually “opening” the resource and are therefore somewhat inconvenient for passing around and storing.

I often want the best of both worlds – a class that can be used to represent any kind of resource from which content can be read, and that can be easily constructed, passed around and stored.

To provide this, I use an interface called “ReadableResource” to represent anything from which an InputStream can be obtained. In principle this consists of a “getInputStream” method that opens the resource and returns a new InputStream for it (which the caller then “owns” and is responsible for closing).

In outline (omitting imports, Javadoc and various details explained later) this is just:

interface ReadableResource {

    InputStream getInputStream() throws IOException;

}

This interface is then implemented by various concrete classes for various different types of resource (e.g. FileReadableResource, URLReadableResource, ClasspathReadableResource, ByteArrayReadableResource etc). Each such implementation provides appropriate constructors and internal state to identify and/or access the resource, and implements the getInputStream method to obtain and return an InputStream for the resource.

For example, in outline the FileReadableResource class is something like this (omitting imports, Javadoc, argument validation, and various other such details):

public class FileReadableResource implements ReadableResource {

    private final File file;

    public FileReadableResource(final File file) {
        this.file = file;
    }

    public InputStream getInputStream() throws FileNotFoundException {
        return new FileInputStream(file);
    }

}
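Other implementations follow the same pattern. As a rough sketch only, a URL-based implementation might look like the following. The article names a URLReadableResource class but does not show it, so the constructor and body here are assumptions rather than the actual code:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

// The basic interface, as outlined above.
interface ReadableResource {
    InputStream getInputStream() throws IOException;
}

// Assumed shape of a URL-based implementation (not taken from the original code).
final class URLReadableResource implements ReadableResource {

    private final URL url;

    URLReadableResource(final URL url) {
        this.url = url;
    }

    @Override
    public InputStream getInputStream() throws IOException {
        // Opens a new connection on every call; the caller owns the
        // returned stream and is responsible for closing it.
        return url.openStream();
    }

}
```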

In principle, that’s it:

  • You can write code that accepts a ReadableResource of any kind.
  • You can construct a ReadableResource to represent a file or any other type of resource for which there is a ReadableResource implementation, typically by means of a simple constructor call (with no need for exception handling or “finally” clause).
  • You can store ReadableResource instances, pass them around, and access their content whenever and wherever appropriate.
  • To add support for some other type of resource (for example, to read from a database BLOB), all you have to do is write the corresponding ReadableResource implementation class, and you can then use such resources with any existing code that takes a ReadableResource.
  • With a ByteArrayReadableResource for reading from byte arrays, it’s easy to test any code that takes a ReadableResource even if the code’s “real” use is for processing a file, URL or other awkward-to-simulate resource.
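To make the first and last of these points concrete, here is a minimal sketch of a consumer that accepts any ReadableResource, together with a byte-array implementation that can be used to test it. The LineCounter class is purely hypothetical, and the body of ByteArrayReadableResource is an assumption (the article only names that class). The sketch uses try-with-resources (Java 7 and later) in place of an explicit “try…finally”:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// The basic interface, as outlined above.
interface ReadableResource {
    InputStream getInputStream() throws IOException;
}

// Assumed in-memory implementation, convenient for tests.
final class ByteArrayReadableResource implements ReadableResource {

    private final byte[] content;

    ByteArrayReadableResource(final byte[] content) {
        this.content = content.clone(); // defensive copy
    }

    @Override
    public InputStream getInputStream() {
        // A fresh stream on every call, so the resource can be read repeatedly.
        return new ByteArrayInputStream(content);
    }

}

// Hypothetical consumer: it neither knows nor cares what kind of resource it is given.
final class LineCounter {

    private LineCounter() {
    }

    static int countLines(final ReadableResource resource) throws IOException {
        // The consumer opens the stream, so the consumer also closes it.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(resource.getInputStream(), StandardCharsets.UTF_8))) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

}
```

A test can then call LineCounter.countLines(new ByteArrayReadableResource(bytes)) without touching the file system, while production code passes a FileReadableResource to the same method.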

In practice, I’ve found the following to be useful refinements of the basic approach:

  • Define a separate ResourceNotFoundException (as a subclass of IOException) to be thrown if the resource does not exist or cannot be found. This is more specific than throwing an IOException, whilst generalizing the various different exceptions that might be thrown when opening different types of resource (for example, the FileNotFoundException thrown by the FileInputStream constructor). Each implementation of getInputStream can then attempt to open the resource, catch whatever exceptions that particular code might throw, and for any such failure throw a ResourceNotFoundException with a suitable message and the original exception as its cause.
  • Add a getDescription method to return a suitable description of the resource, for use in exception messages. Arguably toString could be used for this, but on the whole I’ve found it clearer to define this as a separate method (each implementation’s toString can then use this description, or the actual content of the resource, or both, or neither, as appropriate for each type of resource).
  • Add a boolean argument to getInputStream through which the caller can specify whether it requires the resource to exist or whether the caller regards the resource as optional. When true, getInputStream throws a ResourceNotFoundException if the resource does not exist or cannot be found, but when false it just returns null without throwing an exception. In particular, this is needed when implementing the PrioritizedReadableResource class (described below).
  • Arguably, ReadableResource could also be Serializable. However, I’ve not yet needed this and I’m not convinced that it’s possible and appropriate for all possible types of resources, so for the time being my own ReadableResource interface doesn’t extend Serializable.

So my own ReadableResource interface is actually more like the following (again omitting imports, Javadoc, additional features not described in this article etc):

interface ReadableResource {

    InputStream getInputStream(final boolean mustExist) 
        throws ResourceNotFoundException;

    String getDescription();    

}
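As an illustration of how an implementation might honour this refined interface, here is a sketch of a file-based version. The exception’s constructor, the wording of the messages and the format of the description are all assumptions made for the example, not the actual code:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Assumed shape of the exception: a subclass of IOException that keeps the original cause.
class ResourceNotFoundException extends IOException {

    private static final long serialVersionUID = 1L;

    ResourceNotFoundException(final String message, final Throwable cause) {
        super(message, cause);
    }

}

// The refined interface, as shown above.
interface ReadableResource {

    InputStream getInputStream(boolean mustExist) throws ResourceNotFoundException;

    String getDescription();

}

final class FileReadableResource implements ReadableResource {

    private final File file;

    FileReadableResource(final File file) {
        this.file = file;
    }

    @Override
    public InputStream getInputStream(final boolean mustExist)
            throws ResourceNotFoundException {
        try {
            return new FileInputStream(file);
        } catch (FileNotFoundException e) {
            if (mustExist) {
                // Generalize the file-specific exception, keeping it as the cause.
                throw new ResourceNotFoundException("Cannot open " + getDescription(), e);
            }
            return null; // the caller regards the resource as optional
        }
    }

    @Override
    public String getDescription() {
        return "file " + file.getAbsolutePath();
    }

}
```

Callers that pass false must check for null before using the result.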

Another extension of this that I’ve found useful is to implement a “PrioritizedReadableResource” class that provides a ReadableResource constructed from a sequence of other ReadableResources (including, potentially, nested PrioritizedReadableResource instances) and that accesses the first such resource that actually exists. For example, this lets you construct a ReadableResource that uses a particular file if it exists, or looks for a file in some default location if the first file doesn’t exist, or looks for a particular classpath resource if neither file exists, or accesses a remote URL if none of those resources can be found.
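A minimal sketch of the idea, assuming the refined interface shown earlier, could look like this. The varargs constructor, the description format and the small ByteArrayReadableResource used for illustration are all assumptions rather than the real implementation:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ResourceNotFoundException extends IOException {
    private static final long serialVersionUID = 1L;

    ResourceNotFoundException(final String message) {
        super(message);
    }
}

interface ReadableResource {
    InputStream getInputStream(boolean mustExist) throws ResourceNotFoundException;

    String getDescription();
}

// Simple in-memory resource that always exists (for illustration only).
final class ByteArrayReadableResource implements ReadableResource {
    private final String description;
    private final byte[] content;

    ByteArrayReadableResource(final String description, final byte[] content) {
        this.description = description;
        this.content = content.clone();
    }

    @Override
    public InputStream getInputStream(final boolean mustExist) {
        return new ByteArrayInputStream(content);
    }

    @Override
    public String getDescription() {
        return description;
    }
}

final class PrioritizedReadableResource implements ReadableResource {
    private final List<ReadableResource> candidates;

    PrioritizedReadableResource(final ReadableResource... candidates) {
        this.candidates = new ArrayList<>(Arrays.asList(candidates));
    }

    @Override
    public InputStream getInputStream(final boolean mustExist)
            throws ResourceNotFoundException {
        for (final ReadableResource candidate : candidates) {
            // Ask each candidate as "optional", so that a missing one
            // returns null instead of aborting the search.
            final InputStream in = candidate.getInputStream(false);
            if (in != null) {
                return in;
            }
        }
        if (mustExist) {
            throw new ResourceNotFoundException("None found: " + getDescription());
        }
        return null;
    }

    @Override
    public String getDescription() {
        final StringBuilder sb = new StringBuilder("first available of [");
        for (int i = 0; i < candidates.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(candidates.get(i).getDescription());
        }
        return sb.append(']').toString();
    }
}
```

This is where the boolean argument earns its keep: without it, the loop would have to catch and discard an exception for every candidate that does not exist.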

Of course, the principle also applies to OutputStreams, so I also have a corresponding WriteableResource interface with a getOutputStream method and implementation classes such as FileWriteableResource and ByteArrayWriteableResource.
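In outline, and purely as a sketch, the output side might look something like the following. The in-memory implementation, its getContent accessor and its publish-on-close behaviour are assumptions made for illustration and are not taken from the actual code:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

interface WriteableResource {

    OutputStream getOutputStream() throws IOException;

}

// Assumed in-memory implementation: whatever is written becomes
// available through getContent() once the stream has been closed.
final class ByteArrayWriteableResource implements WriteableResource {

    private byte[] content = new byte[0];

    @Override
    public OutputStream getOutputStream() {
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                // Publish the bytes written so far to the enclosing resource.
                content = toByteArray();
            }
        };
    }

    // Hypothetical accessor, handy for checking what a test wrote.
    byte[] getContent() {
        return content.clone();
    }

}
```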

As a final note, I’m sure I’ve heard something somewhere about recent work to provide a “unified” approach to accessing local and remote file systems, database content, in-memory files etc, which I’d imagine is somewhat along these lines but on a more complete and standardised basis. But for the life of me I can’t remember where I’ve seen this – whether it was a proposed JDK enhancement (part of new NIO?), or internal to some specific product such as Eclipse or NetBeans. Does this ring any bells with anyone?

More generally, has anyone else done this, or seen this in existing frameworks/libraries? I know the Spring framework has an InputStreamSource along these lines, but its Javadoc implies that its main motivation is for handling mail attachments, and its implementation classes are based on its “more extensive” Resource sub-interface which is a very different beast. I’m happy enough with my own existing code but if there’s a more general solution somewhere it might be worth considering, or at least interesting to compare against.

5 responses

2 04 2007
Curt Cox

You might be thinking of this:

Unified I/O is an open source Java I/O library.
http://uio.imagero.com/

JSR 203 will provide a file system API, and thus could be used to abstract away the differences between local and remote file systems.

JSR 203: More New I/O APIs for the Java Platform (“NIO.2”)
http://jcp.org/en/jsr/detail?id=203

There are many existing similar APIs:

Extended Filesystem API (WebNFS)
http://docs.sun.com/app/docs/doc/806-1067/6jacl3e6g?a=view

NetBeans Filesystem API
http://www.netbeans.org/download/dev/javadoc/org-openide-filesystems/org/openide/filesystems/doc-files/api.html

Apache Commons VFS
http://jakarta.apache.org/commons/vfs/index.html

Eclipse File System (EFS)
http://wiki.eclipse.org/index.php/EFS

3 04 2007
Rick

Hi Mike

The idea of a Resource abstraction is a good one, absolutely.

I don’t see a whole lot of difference between the Spring Resource and the ReadableResource described in your blog post though. It doesn’t make a big difference at the end of the day, they’re all good. (Why is Spring’s Resource abstraction ‘a whole different beast though’?)

Mmm, having gone and read the class-level Javadoc for the Spring InputStreamSource, I certainly don’t infer that it is targeted at mail attachments…. that is just an ‘example’. I’d change the Javadoc if I thought that was the implication, but I just don’t see that.

Anyways, good post, cheers
Rick

3 04 2007
closingbraces

Thanks Curt,

Yes, I suspect it was JSR 203 I’d heard about, and couldn’t find it again because the JSR itself is at an early stage and is a bit vague on this subject (“service-provider interface for pluggable filesystem implementations” as one of several objectives).

I’d looked at uio.imagero.com but it seems rather specialized for “image i/o”, and javadoc is rather too skimpy for me. Apache Commons VFS seemed rather overblown and “file system”-focused for my simple needs, but then maybe that’s the right approach after all.

I think my issue is that “virtual file system” is at a much higher level and tackles a far broader issue than the core mechanism I’m looking for (“filesystem” vs. “mechanism for reading byte streams”). So I’ve been seeing these full-blown file-system APIs and looking for suitable interfaces/mechanisms within them.

Maybe a VFS would do everything I want and is just far more than I need. Or maybe the ideal would be a suitable core mechanism with a VFS built on top of it. I’ll take a look at the APIs you listed and see what’s inside them.

Thanks for that.

3 04 2007
closingbraces

Rick,

I take your point re Spring’s InputStreamSource: it does say that “Useful as an abstract content source for mail attachments” is an example, it’s just that it’s the only example and also has specific notes about JavaMail within the only method’s Javadoc, so it kind of dominates the javadoc. But you’re right, there’s no reason to infer that its use (other than as a base for Resource) is just for mail attachments.

The “different beast” comment was based on Resource having createRelative, getFile, getFilename, getURL methods, plus the isOpen stuff about whether it’s a handle to an already-open stream. But I guess I didn’t look at them closely enough, so it all looked very “File” focused, and the ability to represent an open stream rather than providing a stream “on demand” seemed opposed to the core idea. Looking at the actual methods more carefully, it does look like what I’m after but just with extra methods that support file-like facilities and throw various exceptions (or return false) for resources that aren’t of that kind.

Thanks for clearing that up, cheers,
Mike

3 04 2007
closingbraces

A further thought… maybe there are two different viewpoints and approaches here:

– Unify different resource types by treating them all as having a “neutral” common base class/interface.

– Unify different resource types by treating them all as being one particular type of resource – specifically, files within a file system (with some facilities then being “unsupported” for some resources).

For the “file system” approach, I guess one could just as well base it on some other specific type of resource, e.g. treating all resources as URLs, or maybe URIs. Actually, that seems rather more appropriate – “Uniform Resource Identifier” is pretty much exactly it.

OK, it’s only words and how one looks at things, but it does make me wonder whether all these “Virtual File System”s based primarily on concepts such as File, filename, directories etc shouldn’t actually be “Virtual Resource System”s based primarily on URIs/URLs/URNs (and maybe REST etc).

Just one of those crazy lunchtime thoughts…

Mike
