This is something I have been hacking together for a while now; FS is a file-system abstraction for Python. It has reached a stable state and is worthy of an official (0.1.0) release.

First, a brief history of the project. A while back, I was working mainly with desktop applications in wxPython. I found that I had a number of sources for files; there were read-only resources such as images, per-user files and per-installation config files. The location of these files would change depending on whether I was debugging or building a release version. The logic required to manage all this was pretty ugly and error-prone. So I wrote a collection of classes to bring all these disparate locations for files under a single virtual file-system. For example, to open a config file I could just read 'user/settings.cfg', and to open an image resource I could just read from 'resources/logo.png' and the virtual file-system would do the right thing and return a file-like object.

This turned out to be insanely useful, and I used it for a number of projects. I also used it while working for ICC, where I had the opportunity to enhance it based on feedback from colleagues. In the last few months I have re-written it from scratch, partly because I wanted to avoid any copyright issues, but mainly because I could make a better job of it the second time around.

Getting Started

You can install FS with the command 'easy_install fs', or manually by downloading the latest release from (http://code.google.com/p/pyfilesystem/downloads/list) and running 'python setup.py install'. FS has been checked on Linux and Windows, but should run anywhere since ultimately it uses the standard library. There is no documentation other than the docstrings at the moment, so this post will have to suffice till I write up the API.

The main module is called fs, and there are a number of sub-modules:

  • fs.osfs Contains the class OSFS, which is a simple layer around the operating system's own file-system.
  • fs.memoryfs Contains the class MemoryFS, which is a file-system that exists only in memory.
  • fs.mountfs Contains the class MountFS which can be used to mount other file-systems at various places in the directory structure.
  • fs.multifs Contains the class MultiFS which creates a file-system from other file-systems which are tried in-order till a file operation succeeds.
  • fs.zipfs Contains the class ZipFS which creates a file-system from a zip file.
  • fs.tempfs Contains the class TempFS which creates a temporary file-system that can be automatically cleaned up.
  • fs.utils Contains a number of functions for dealing with FS objects.
  • fs.browsewin Contains the 'browse' function which opens a tree view of the file-system passed to it. This is mainly a debugging aid. Requires wxPython.
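
To make the MountFS and MultiFS descriptions a little more concrete, here is a rough sketch of the two lookup ideas using only the standard library. This is not how FS implements them; the function names resolve_mounted and find_first and the directories in the usage note are invented for illustration.

```python
import os
import posixpath


def resolve_mounted(mounts, virtual_path):
    """Map a virtual path such as 'user/settings.cfg' to a real path.

    `mounts` maps a virtual prefix (e.g. 'user') to a real directory.
    Hypothetical helper for illustration, not part of FS.
    """
    parts = posixpath.normpath(virtual_path).lstrip("/").split("/")
    prefix, rest = parts[0], parts[1:]
    if prefix not in mounts:
        raise KeyError("nothing mounted at %r" % prefix)
    return os.path.join(mounts[prefix], *rest)


def find_first(search_dirs, relative_path):
    """Return the first existing candidate, trying directories in order.

    Hypothetical helper that mimics the 'try each in turn' idea.
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, *relative_path.split("/"))
        if os.path.exists(candidate):
            return candidate
    raise IOError("%r not found in any directory" % relative_path)
```

With something like mounts = {'user': '/home/will/.myapp', 'resources': '/usr/share/myapp'} (made-up locations), resolve_mounted(mounts, 'user/settings.cfg') picks the per-user directory, and only the mounts dictionary has to change between a debug and a release build.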

Examples

In lieu of documentation, I'm just going to run through an interactive session that shows a few features.

>>> from fs import *
>>> home = osfs.OSFS('~/')
>>> from fs.browsewin import browse
>>> browse(home)

This creates an object called 'home' which represents your home directory (use a different path if you are on a non-Linux system). You can use it to open files, list directory contents, copy files etc. But it can never access files that aren't under your home directory, so it could be considered a sandboxed view on to the underlying file-system of the OS. The root of the home object is re-positioned, so the path "/readme.txt" would map to "~/readme.txt". Most of the methods of an FS object are pretty self-explanatory, but there are a few that require explanation. Try the following for example:

>>> projects = home.opendir('projects') # Assumes there is a directory called ~/projects
>>> projects
<SubFS: /projects in <OSFS: /home/will>>
>>> browse(projects)

There is no concept of a working directory for FS objects. Rather than something like 'chdir' there is an opendir method that returns a new FS object representing everything under the sub-directory root. So if you were to do projects.open("test.py"), it would return a file object for "~/projects/test.py".
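
If it helps to see the idea in code, the re-rooting and opendir behaviour can be approximated in a few lines of standard-library Python. This is only a sketch under my own assumptions, not the code in FS: the class name DirView and its syspath method are invented here, and it makes no attempt to deal with symlinks that point outside the root.

```python
import os
import posixpath


class DirView(object):
    """A tiny stand-in for an FS object rooted at one directory.

    Hypothetical class for illustration; OSFS and SubFS do far more.
    """

    def __init__(self, root):
        self.root = os.path.abspath(os.path.expanduser(root))

    def syspath(self, path):
        # Anchor the path at '/' before normalising, so that any '..'
        # components stop at the root instead of climbing above it.
        relative = posixpath.normpath("/" + path).lstrip("/")
        if not relative:
            return self.root
        return os.path.join(self.root, *relative.split("/"))

    def open(self, path, mode="r"):
        return open(self.syspath(path), mode)

    def opendir(self, path):
        # No working directory to change: hand back a new view instead.
        sub_root = self.syspath(path)
        if not os.path.isdir(sub_root):
            raise IOError("not a directory: %r" % path)
        return DirView(sub_root)
```

With this, DirView('~/').opendir('projects').open('test.py') would open "~/projects/test.py", and a path such as "../../etc/passwd" resolves to a location inside the root rather than escaping it.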

Have you ever wondered how much space the .py files take up in your home directory? Give this a try:

>>> print sum(home.getsize(path) for path in home.walkfiles(wildcard="*.py"))
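
For comparison, here is roughly what that one-liner replaces if you only have the standard library to hand. It is a sketch rather than a description of what walkfiles does internally, and the function name total_size is my own.

```python
import fnmatch
import os


def total_size(root, wildcard="*.py"):
    """Sum the sizes, in bytes, of files under `root` matching `wildcard`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(os.path.expanduser(root)):
        for name in fnmatch.filter(filenames, wildcard):
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # Skip entries that vanish or cannot be read (e.g. a
                # broken symlink); an assumption made for this sketch.
                pass
    return total
```

Calling total_size('~/') should give a comparable figure, at the cost of dealing with real paths yourself.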

Here's how you could use ZipFS to archive your projects folder:

>>> projects_archive = zipfs.ZipFS('projects.zip', 'w')
>>> from fs.utils import copydir
>>> copydir(projects, projects_archive)
>>> projects_archive.close()
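
Again for comparison, a standard-library version of the same job might look like the sketch below. The function name archive_directory is made up, and unlike a real file-system copy it does not record empty directories.

```python
import os
import zipfile


def archive_directory(src_dir, zip_path):
    """Write every file under `src_dir` into a new zip at `zip_path`."""
    src_dir = os.path.expanduser(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for dirpath, _dirnames, filenames in os.walk(src_dir):
            for name in filenames:
                full_path = os.path.join(dirpath, name)
                # Store paths relative to src_dir so the archive carries
                # no absolute, machine-specific prefixes.
                arcname = os.path.relpath(full_path, src_dir)
                archive.write(full_path, arcname)
```

Keep the zip file outside the directory being archived, otherwise the walk will find the half-written archive and try to add it to itself.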

If you are interested in learning more, have a look at the docstrings in base.py, or just ask me. I promise to get proper documentation up soon.

FS is politeware: you can use it for any purpose you like, as long as you say thanks!

Disclaimer: FS has been well tested, but as there is file access involved, be careful!
This blog post was posted to It's All Geek to Me on Sunday September 21st, 2008 at 11:58AM
 

17 Responses to "Announcing FS 0.1.0, a Python file-system"

  • September 21st, 2008, 1:39 p.m.

    Have you thought about any FUSE support?

  • Stavros
    September 21st, 2008, 2:14 p.m.

    Hmm, I have created something similar, omnisync: https://launchpad.net/omnisync.

    It's a file synchroniser but it uses virtual file systems for synchronising files, so you can just specify an sftp/s3/local server or whatever and it will sync the two directories. I thought of doing a zip module as well but since zips don't support some features such as arbitrary writes, I decided against it. How does FS handle that?

  • September 21st, 2008, 2:16 p.m.

    Does it work with Windows?

  • September 21st, 2008, 2:19 p.m.

    Ilpo, don't know much about Fuse, but the OSFS object will expose the Linux filesystem. Exposing a FS object to Fuse is not something I have plans to do, but I'm sure it's possible!

    Stavros, thanks for the link, will check it out. The ZipFS object creates temporary files for writing. When the temp file is closed, its contents are added to the zip.

  • September 21st, 2008, 2:19 p.m.

    Michael, yes it does. :-)

  • Jussi
    September 21st, 2008, 2:39 p.m.

    How about returning information about the current path object when there is no parameter given? For example:

    tmp = osfs.OSFS('C:\\tmp')
    # This requires name of the directory and will fail.
    tmp.isdirempty()
    # This does what I wanted though.
    tmp.isdirempty(".")

    There are others too: getinfo, isdir, isfile,...
    listdir works like that :)

    Anyway... Nice work! This is just what I needed. Thank you.

  • September 21st, 2008, 3:16 p.m.

    Cool :-)

  • September 21st, 2008, 3:35 p.m.

    Twisted has something similar called FilePath.

    http://twistedmatrix.com/documents/8.1.0/api/twisted.python.filepath.FilePath.html

    It's got some interesting advantages; for example, there's a Zip implementation which can stream data out of an archive rather than reading the entire contents of an individual file into memory.

    http://twistedmatrix.com/documents/8.1.0/api/twisted.python.zippath.ZipArchive.html

    Another advantage of this is that you can get access to a FilePath object from twisted.python.modules, allowing something sort of like setuptools' pkg_resources (you can load resources from "next to your code" whether it's in a zip file or not), but with much less overhead.

    http://twistedmatrix.com/documents/8.1.0/api/twisted.python.modules.html

    But your offering here looks much more coherent and polished. We could really use some help maintaining, documenting, and promoting it, if you would be willing to join forces :).

  • Doug Napoleone
    September 21st, 2008, 7:04 p.m.

    One place where this could be of a huge benefit is on the Google App Engine platform which does not have a 'real filesystem' and most of the os/os.path features are missing.

    Unlike twisted it looks like it would not take much to get a subset working under GAE (the wx browser stuff would need a GAE html interface).

    This would be a huge help to getting existing projects up and running under GAE.

  • September 22nd, 2008, 3:29 p.m.

    This does look quite nifty. Have you checked out Jason Orendorff's path.py module? Though the aims are different, there is some overlap in that path.py is aiming to make working with files easier (and I've found that it does indeed make working with files much easier).

  • September 22nd, 2008, 5:22 p.m.

    Doug, when you said "Unlike twisted it looks like it would not take much to get a subset working under GAE", what exactly did you mean? FS 0.1.0 isn't comparable to Twisted. It is comparable to one module in Twisted, twisted.python.filepath. Do you mean that twisted.python.filepath would be much harder to get working on GAE than FS 0.1.0 would be? If so, can you explain why?

    (Will, sorry, I hope this doesn't devolve into some boring off-topic discussion, but I'm really curious about what Doug meant.)

  • September 22nd, 2008, 7:58 p.m.

    Jussi, I'll give that some thought!

    Glyph, I was disappointed that the zipfile module in the standard library couldn't stream zipped files. I'm all for joining forces, but I'd like to concentrate on my interface. I may borrow your zip implementation -- in the spirit of open source. :-)

    Doug, Good idea. Although I recall reading about someone who worked around the limit. Putting templates in a zip file perhaps.

    Kevin, I have used Jason's path.py module. Very useful it is too, but I didn't like putting all that functionality in a class derived from a string. It didn't sit well for me, from an OO point of view. It can still be used with FS though.

    Jean, boring off-topic discussions are always welcome here!

  • September 29th, 2008, 7:24 a.m.

    A comment about Jussi's comments. It's a good idea although it adds a bit of "magic" which isn't always a good thing, although I am somewhat in favour of doing this. Jussi should note though that you can just pass in an empty string. You don't have to pass in "." in order to get information about the root.

    I have to compliment you on your code style. I like how you chose to make all those simple methods in osfs (like exists, isdir, isfile) into 2-liners, instead of one-liners.

    That's cool that you got within-zipfile copying working. I finally figured out how you did it after looking at the code for a while.

    I think remove() could work too. We could just lazily store all the remove operations and then when the zip file is closed, re-create the entire zip file from scratch with those files removed.

  • jaross
    March 11th, 2009, 4:40 p.m.

    what about itools.vfs? http://docs.hforge.org/itools/vfs.html

  • Andy
    August 2nd, 2009, 12:32 a.m.

    Hi Will - just downloaded this and it looks great! Thank you very much for doing it!

    Just a quick question…. I'm just wanting to confirm what the license for “fs” is. “Politeware” is mentioned above, but in the “pkg-info” file it mentions the Python license. So, if you could confirm which one it is, that'll be great!

    Thanks again - bye for now -

    - Andy

  • Evan Driscoll
    August 27th, 2009, 12:05 a.m.

    I was looking for something I could use as a mock file system for testing purposes, and came across this. Very helpful; thanks!

    I have a couple suggestions, and a function that you may find useful.

    First, the suggestions:

    - Make MemoryFile into a context manager so you can say “with fs.memoryfs.open(…) as file:” as you can with the open builtin. (I can work around this easily enough though:

    def enterMemoryFile(self):
        pass

    def exitMemoryFile(self, a,b,c):
        self.close()

    mfs.MemoryFile.__enter__ = enterMemoryFile
    mfs.MemoryFile.__exit__ = exitMemoryFile

    (I imported fs.memoryfs as mfs)

    - Support the notion of a current directory

    - Make your path-handling more path-separator agnostic. One great suggestion I saw was to make a mock filesystem where os.sep was something very weird; but you seem to depend on it being either / or \:

    >>> import os
    >>> os.sep='#'
    >>> from fs.memoryfs import MemoryFS
    >>> fs = MemoryFS()
    >>> fs.makedir("a#b", recursive=True)
    <MemoryFS>
    >>> fs.makedir("a/c", recursive=True)
    <MemoryFS>
    >>> fs.makedir("a\\d", recursive=True)
    <MemoryFS>
    >>> for e in fs.walk('/a'):
    ...     print e
    ...
    ('/a', [])
    ('/a/d', [])
    ('/a/c', [])

    Now a handy function. Instead of a bunch of makedir and createfile calls to set up the file system, I wrote a function that will let you basically write a file system as a bunch of nested dictionaries. The keys to dictionaries are the file/directory names, and the values are the contents. String contents are files with that string, and dictionary contents are subdirectories.

    def create_mock_filesystem(fs_dict):
        fs = memoryfs.MemoryFS()
        _create_mock_filesystem_impl(fs_dict, fs.opendir('/'))
        return fs


    def _create_mock_filesystem_impl(fs_dict, subfs):
        for direntry, contents in fs_dict.iteritems():
            assert type(direntry) is str

            if type(contents) is str:
                # direntry is a file
                subfs.createfile(direntry, contents)
            elif type(contents) is dict:
                # direntry is a directory
                subfs.makedir(direntry)
                _create_mock_filesystem_impl(contents, subfs.opendir(direntry))
                pass
            else:
                assert False

    To wit:

    >>> fs_dict = {
    ...     'file.txt' : 'I am a file!',
    ...     'some_dir' : {
    ...         'file.txt' : 'I am another file!\nIn fact, I have two lines!',
    ...     }
    ... }
    >>> fs = create_mock_filesystem(fs_dict)
    >>> for name in ['file.txt', '/some_dir/file.txt']:
    ...     print "Contents of file", name
    ...     with fs.open(name) as file:
    ...         for line in file:
    ...             print " ", line.rstrip()
    ...
    Contents of file file.txt
      I am a file!
    Contents of file /some_dir/file.txt
      I am another file!
      In fact, I have two lines!

    So thanks again, you saved me from a bit of reimplementation. ;-)

  • Evan Driscoll
    August 27th, 2009, 12:07 a.m.

    Damn, there's a big mistake in my last post… I pasted a version of enterMemoryFile from before I tested it and fixed a bug. That function needs to return self.

    Here's the actual code I have now:

    # Allow MemoryFile to be used as a context manager ('with fs.open(...) as file:')
    try:
        memoryfs.MemoryFile.__enter__
    except AttributeError:
        def _memoryfile_enter(self):
            return self
        memoryfs.MemoryFile.__enter__ = _memoryfile_enter

    try:
        memoryfs.MemoryFile.__exit__
    except AttributeError:
        def _memoryfile_exit(self, a, b, c):
            self.close()
        memoryfs.MemoryFile.__exit__ = _memoryfile_exit

Will McGugan

My name is Will McGugan. I am an unabashed geek, an author, a hacker and a Python expert – amongst other things!

© 2008 Will McGugan.