Serve FTP with Python and PyFilesystem

April 25th, 2012

Ben Timby has committed code to PyFilesystem that lets you expose any filesystem over FTP. We've had the ability to serve filesystems over SFTP (secure ftp) and XMLRPC for a while, but plain old FTP was a glaring omission–until now.

You can serve the current directory programmatically with something like the following:

from fs.expose.ftp import serve_fs
from fs.osfs import OSFS
serve_fs(OSFS('.'), '127.0.0.1', 21)

The same functionality is also available to the fsserve command. The following is equivalent to the above code, but from the command line:

fsserve -t ftp .

You'll probably need root privileges (i.e. sudo) on Linux for these examples.
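
The need for root comes from the port number: by default Linux only lets root bind TCP ports below 1024, and the examples above use port 21. Here is a minimal sketch of sidestepping that by serving on a higher port. The needs_root helper and the port number 2121 are illustrative assumptions; serve_fs is called with the same three arguments as in the example above.

```python
def needs_root(port):
    """Return True if binding this TCP port normally requires root on Linux."""
    return 0 < port < 1024


def serve_current_dir(port=2121):
    """Serve the current directory over FTP on localhost (blocks until stopped)."""
    # Imported lazily so that needs_root() is usable without PyFilesystem installed.
    from fs.expose.ftp import serve_fs
    from fs.osfs import OSFS

    if needs_root(port):
        print("Port %d is privileged; you may need sudo." % port)
    serve_fs(OSFS('.'), '127.0.0.1', port)
```

Calling serve_current_dir() should then let a client connect to ftp://127.0.0.1:2121 without elevated privileges.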

With the server running, you can browse the served files with an FTP client, or by typing “ftp://127.0.0.1” into your browser. Any of the other supported filesystems can be served in the same way.

FTP has been around since the dawn of the internet, so just about any network-enabled device will be able to access files exposed this way. It's a great way of creating a gateway to other filesystems. You could expose files stored on Amazon S3, for example.

You'll need to check out the latest code from SVN to try this out.

Update: Ben has posted more about this.

 

“ RT @worrp: Proof of concept Riak-backed virtual filesystem - https://t.co/EagEcjU9 #riak #python #pyfilesystem /cc @willmcgugan // Cool ”


“ Creating a virtual filesystem with #PyFilesystem http://is.gd/JF3X2t ”


Creating a Virtual Filesystem with Python (and why you need one)

March 20th, 2011

If you are writing an application of any size, it will most likely require a number of files to run – files which could be stored in a variety of possible locations. Furthermore, you will probably want to be able to change the location of those files when debugging and testing. You may even want to store those files somewhere other than the user's hard drive.

Any engineer worth his salt will recognise that the file locations should be stored in some kind of configuration file and the code to read the files in question should be factored out so that it isn't just scattered at points where data is read or written. In this post I'll present a way of doing just that by creating a virtual filesystem with PyFilesystem.

You'll need the most recent version of PyFilesystem from SVN to run this code.

We're going to create a virtual filesystem for a fictitious application that requires per-application and per-user resources, as well as a location for cache and log files. I'll also demonstrate how to mount files located on a web server. Here's the code:

from fs.opener import fsopendir
app_fs = fsopendir('mount://fs.ini', create_dir=True)

That's all there is to it; two lines of code (one if you don't count the import). Obviously there is quite a bit going on under the hood here, which I'll explain below, but let's see what this code gives you…

The app_fs object is an interface to a single filesystem that contains all the file locations our application will use. For example, the path /user/app.ini references a per-user file, whereas /resources/logo.png references a per application file. The actual physical location of the data is irrelevant because as far as your application is concerned the paths never change. This abstraction is useful because the real path for such files varies according to the platform the code is running on; Windows, Mac and Linux all have different conventions, and if you put your files in the wrong place, your app will likely break on one platform or another.
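
To make the idea of fixed virtual paths concrete, here is a small standard-library sketch of the lookup involved. It is not how PyFilesystem implements MountFS; the MountTable class and the example locations are hypothetical.

```python
import posixpath


class MountTable(object):
    """Maps top-level virtual directories to real locations (illustrative only)."""

    def __init__(self, mounts):
        # mounts: dict of virtual directory name -> real location
        self.mounts = dict(mounts)

    def resolve(self, path):
        """Translate a virtual path such as '/user/app.ini' to its real location."""
        parts = posixpath.normpath(path).strip('/').split('/', 1)
        top = parts[0]
        rest = parts[1] if len(parts) > 1 else ''
        if top not in self.mounts:
            raise KeyError('nothing mounted at /%s' % top)
        base = self.mounts[top].rstrip('/')
        return base + '/' + rest if rest else base


table = MountTable({
    'user': '/home/will/.myapp',       # hypothetical per-user location
    'resources': '/usr/share/myapp',   # hypothetical per-application location
})
print(table.resolve('/user/app.ini'))
```

The application only ever sees /user/app.ini; changing the dictionary is enough to move the data somewhere else.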

Here's how a per-user configuration file might be opened:

from ConfigParser import ConfigParser
# The 'safeopen' method works like 'open', but will return an
# empty file-like object if the path does not exist
with app_fs.safeopen('/user/app.ini') as ini_file:
    cfg = ConfigParser()
    cfg.readfp(ini_file)
    # ... do something with cfg
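
If it helps to see what that behaviour amounts to, here is a rough standard-library approximation written for Python 3 (where the module is called configparser). The safe_open function is a hypothetical stand-in and only mimics the behaviour described in the comment above; it is not PyFilesystem's implementation.

```python
import io
from configparser import ConfigParser  # named ConfigParser in Python 2


def safe_open(path):
    """Open a text file for reading, or return an empty file-like object
    if the file cannot be opened."""
    try:
        return open(path, 'r')
    except (IOError, OSError):
        return io.StringIO(u'')


# The path below does not exist, so an empty file-like object is parsed instead
with safe_open('/no/such/dir/app.ini') as ini_file:
    cfg = ConfigParser()
    cfg.read_file(ini_file)

print(cfg.sections())  # prints []
```

The calling code needs no special case for a missing file; it simply ends up with an empty configuration.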

The files in our virtual filesystem don't even have to reside on the local filesystem. For instance, /live/ may actually reference a location on the web, where the version of the current release and a short ‘message of the day’ is stored.

Here's how the version number and MOTD might be read:

def get_application_version():
    """Get the version number of the most up to date version of the application,
    as a tuple of three integers"""
    with app_fs.safeopen('live/version.txt') as version_file:
        version_text = version_file.read().rstrip()
    if not version_text:
        # Empty file or unable to read
        return None
    return tuple(int(v) for v in version_text.split('.', 3))

def get_motd():
    """Get a welcome message"""
    with app_fs.safeopen("live/motd.txt") as motd_file:
        return motd_file.read().rstrip()

You'll notice that even though the actual data is retrieved over HTTP (the files are located here and here), the code would be no different if the files were stored locally.
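
One thing an application might do with that version tuple is decide whether to offer an update. The sketch below is an assumption about how that could look, not part of the original code; parse_version and update_available are hypothetical helpers that take the text already read from version.txt.

```python
def parse_version(version_text):
    """Parse text such as '1.2.3' into a tuple of ints.

    Returns None if the text is empty or malformed.
    """
    version_text = version_text.strip()
    if not version_text:
        return None
    try:
        return tuple(int(part) for part in version_text.split('.'))
    except ValueError:
        return None


def update_available(installed, latest):
    """Return True only if a latest version is known and is newer."""
    return latest is not None and latest > installed
```

Comparing tuples of integers avoids the classic mistake of comparing version strings, where '1.10.0' would sort before '1.2.3'.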

So how is all this behaviour created from a single line of code? The line fsopendir("mount://fs.ini", create_dir=True) opens a MountFS from the information contained within an INI file (create_dir=True will create specified directories if they don't exist). Here's an example of an INI file that could be used during development:

[fs]
user=./user
resources=./resources
logs=./logs
cache=./user/cache
live=./live

The INI file is used to construct a MountFS, where the keys in the [fs] section are the top level directory names and the values are the real locations of the files. In the above example, /user/ maps onto a directory called user relative to the current directory – but it could be changed to an absolute path or to a location on a server (e.g. FTP, SFTP, HTTP, DAV), or even to a directory within a zip file.

You can change the section to use in a mount opener by specifying it after a # symbol, e.g. mount://fs.ini#mysection
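
As a rough illustration of what the mount opener has to do with that URL and INI file, here is a standard-library sketch for Python 3. It is an assumption about the general shape of the work, not PyFilesystem's actual code; split_mount_url and read_mounts are hypothetical names, and the default section name is taken from the [fs] example above.

```python
from configparser import ConfigParser  # named ConfigParser in Python 2


def split_mount_url(url):
    """Split 'mount://fs.ini#mysection' into ('fs.ini', 'mysection').

    The section defaults to 'fs' when no '#' part is given.
    """
    prefix = 'mount://'
    if not url.startswith(prefix):
        raise ValueError('not a mount URL: %r' % url)
    ini_path, _, section = url[len(prefix):].partition('#')
    return ini_path, section or 'fs'


def read_mounts(ini_text, section='fs'):
    """Return a dict of top-level directory name -> location from INI text."""
    # Interpolation is disabled so '%' characters in locations are left alone
    parser = ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return dict(parser.items(section))
```

Each key in the returned dictionary would become a top-level directory, and each value would be opened as a filesystem in its own right.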

There are a few changes to this INI file we will need to make when our application is ready for release. User data, site data, logs and cache all have canonical locations that are derived from the name of the application (and the author on Windows). PyFilesystem contains handy openers for these special locations. For example, appuser://examplesoft:myapp detects the appropriate per-user data location for an application called “myapp” developed by “examplesoft”. The same goes for the other per-application directories, e.g.:

[fs]
user=appuser://examplesoft:myapp
resources=appsite://examplesoft:myapp
logs=applog://examplesoft:myapp
cache=appcache://examplesoft:myapp
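
To give a feel for why an opener like appuser:// is worth having, here is a sketch of the usual per-user data conventions on each platform. The exact directories PyFilesystem picks are not stated in this post, so treat the paths below as an assumption based on common platform conventions; user_data_dir is a hypothetical helper.

```python
import os
import sys


def user_data_dir(appname, author, platform=None, environ=None, home=None):
    """Guess the conventional per-user data directory for an application."""
    platform = sys.platform if platform is None else platform
    environ = os.environ if environ is None else environ
    home = os.path.expanduser('~') if home is None else home

    if platform.startswith('win'):
        # Windows groups application data by author, then by application
        base = environ.get('APPDATA', os.path.join(home, 'AppData', 'Roaming'))
        return os.path.join(base, author, appname)
    if platform == 'darwin':
        return os.path.join(home, 'Library', 'Application Support', appname)
    # Other Unix-like systems: follow the XDG base directory convention
    base = environ.get('XDG_DATA_HOME', os.path.join(home, '.local', 'share'))
    return os.path.join(base, appname)
```

Note that the author name only appears in the Windows branch, which matches the remark above about the author being used on Windows.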

The /live/ path is different in that it needs to point to a web server:

live=http://www.willmcgugan.com/static/cfg/

Of course, you don't need to use the canonical locations. For instance, let's say you want to store all your static resources in a zip file. No problem:

resources=zip://./resources.zip

Or you want to keep your user data on an SFTP (Secure FTP) server:

user=sftp://username:password@example.org/home/will/

Perhaps you don't want to preserve the cache across sessions, for security reasons. The temp opener creates files in a temp directory and deletes them on close:

cache=temp://

Although, if you are really paranoid you can store the cache files in memory without ever writing them to disk:

cache=mem://

Setting /user/ to mem:// is a useful way of simulating a fresh install when debugging.

I hope that covers why you might need – or at least want – a virtual file system in your application. I've glossed over some of the details and other features of PyFilesystem. If you would like more information, see my previous posts, check out the documentation or join the PyFilesystem discussion group.

 

“ Sneak Preview of New Features in #PyFilesystem http://is.gd/jUcHz ”


Sneak Preview of New Features in PyFilesystem 0.4

January 1st, 2011

There have been some pretty exciting developments in PyFilesystem since version 0.3 was released – Ryan Kelly and I have been hard at work, and there have been a number of excellent contributions to the code base from other developers. Version 0.4 will be released some time in January, but I'd like to give you a preview of some new features before the next version lands.

PyFilesystem is a Python module that provides a simplified common interface to many types of filesystem.

It is now possible to open any of the supported filesystems from a URL-style string (an FS URL), which makes it very easy to specify a filesystem (or individual file) from the command line or a config file. Here's a quick example that opens a bunch of quite different filesystems:

from fs.opener import fsopendir
projects_fs = fsopendir('/projects')
zip_fs = fsopendir('zip:///foo/bar/baz.zip')
ftp_fs = fsopendir('ftp://ftp.mozilla.org')
sftp_fs = fsopendir('sftp://example.org')

If you used PyFilesystem in your application you could trivially change where your files are physically located, or where you save generated files to.

You can also open a file directly without the need to explicitly open the filesystem it is contained within, with the fsopen function, e.g.:

from fs.opener import fsopen
print fsopen('zip:///foo/bar/baz.zip!dir/somefile.txt').read()
print fsopen('ftp://ftp.mozilla.org/pub/README').read()

fsopen is very similar to the builtin open function and will return a file-like object of some kind. In fact, if you pass in a system path it works as open would (although any exception raised will be an instance of fs.errors.FSError rather than IOError).

Because of this similarity with the builtin, fsopen could be used to shadow open, and instantly add the ability for an application to open files on mediums other than a system drive. This is all it takes:

from fs.opener import fsopen as open
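
Looking back at the FS URLs used in this section, they break down into a protocol, a location, and (for archives) a path inside the archive after the ! character. The sketch below shows that breakdown for the simple cases above. It is a simplified illustration with a hypothetical split_fs_url function, not PyFilesystem's own parser, and it makes no attempt to handle credentials or anything more elaborate.

```python
def split_fs_url(fs_url):
    """Split an FS URL into (protocol, location, inner_path).

    A plain system path has no protocol and yields ('', path, '').
    """
    protocol, sep, rest = fs_url.partition('://')
    if not sep:
        # No '://' found, so treat the whole string as a system path
        return '', fs_url, ''
    # Anything after '!' names a file or directory inside the archive
    location, _, inner_path = rest.partition('!')
    return protocol, location, inner_path


for url in ('/projects',
            'zip:///foo/bar/baz.zip!dir/somefile.txt',
            'ftp://ftp.mozilla.org/pub/README'):
    print(split_fs_url(url))
```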

FS Commands

Version 0.4 also adds a number of applications that mirror some of the standard command line apps, but extend their functionality with FS URLs. For example, fsls functions just like the ls command, but works with any of the supported filesystems:

will@will-linux:~$ fsls .
will@will-linux:~$ fsls zip://myzip
will@will-linux:~$ fsls ftp://ftp.mozilla.org/pub
will@will-linux:~$ fsls sftp://user:pass@securesite.com/home/will

You can also copy and move files between filesystems with fscp and fsmv, which work in a very similar manner to their cp and mv counterparts, with a few extensions such as multi-threaded copying (great for network filesystems) and an ASCII progress bar. The following example copies all the .py files in my projects directory to a zip file on an FTP server, and displays a progress bar to boot.

will@will-linux:~$ fscp ~/projects/*.py zip:ftp://will:password@example.org/backups/code.zip
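
The reason multi-threaded copying helps on network filesystems is that each copy spends most of its time waiting on I/O, so several can overlap. Here is a sketch of the general technique using only the standard library (Python 3) and local files. It is not how fscp is implemented; copy_files is a hypothetical helper.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def copy_files(sources, dest_dir, threads=4):
    """Copy each file in sources into dest_dir using a pool of worker threads."""
    os.makedirs(dest_dir, exist_ok=True)

    def copy_one(src):
        dst = os.path.join(dest_dir, os.path.basename(src))
        shutil.copyfile(src, dst)
        return dst

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves the order of sources; list() surfaces any copy error here
        return list(pool.map(copy_one, sources))
```

Something like copy_files(glob.glob('projects/*.py'), 'backup') would then copy up to four files at a time.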

Then there is the fscat command that writes a file in a filesystem to the terminal. The following example displays a python file in the zip file that we created with the previous command:

will@will-linux:~$ fscat zip:ftp://will:password@example.org/backups/code.zip!pythonrocks.py

The other commands fsmkdir, fsrm and fstree work as you may expect.

Serving

The fsserve command adds the ability to serve any of the supported filesystems over a number of protocols. The default is to serve http – in effect creating a webserver. The following is all it takes to serve the contents of your current working directory:

will@will-linux:~$ fsserve

Now if you point your browser at http://127.0.0.1 you will see a web page with the contents of your current working directory (or an index.html file if present). It's not the most bullet-proof of web servers, but handy if you quickly want to serve some files. Naturally, fsserve works with any filesystem you pass to it. You could, for instance, serve the contents of a zip file without ever explicitly unzipping it, or create an FTP to HTTP gateway by serving an FTP filesystem. The following command creates an FTP to HTTP gateway for ftp.mozilla.org:

will@will-linux:~$ fsserve ftp://ftp.mozilla.org

You also have the option of serving a filesystem over SFTP (Secure FTP), or by RPC (Remote Procedure Call). Either of these two methods exposes all the functionality of the remote filesystem, so you could run a server on one machine and create/move/copy/delete files from another machine on the network (or internet). For example, the following would serve the current working directory on localhost, port 3000:

will@will-linux:~$ fsserve -t rpc -p 3000

You can then connect to that server from another machine on the network. Assuming my local IP is 192.168.1.64 the following would display the directory structure from another machine on my network:

will@will-linux:~$ fstree rpc://192.168.1.64:3000
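
For a feel of what serving a filesystem by RPC involves, here is a deliberately tiny sketch built on Python 3's standard xmlrpc modules. It exposes a single read-only listdir call rather than a full filesystem, and it is not the protocol or code that fsserve uses; start_listing_server is a hypothetical helper.

```python
import os
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def start_listing_server(root, host='127.0.0.1', port=0):
    """Serve sorted directory listings of root over XML-RPC in a background thread."""
    root = os.path.abspath(root)

    def listdir(relative_path=''):
        target = os.path.abspath(os.path.join(root, relative_path))
        # Refuse paths that escape the served directory
        if target != root and not target.startswith(root + os.sep):
            raise ValueError('path is outside the served directory')
        return sorted(os.listdir(target))

    # Port 0 asks the OS for any free port; read it back from server_address
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(listdir, 'listdir')
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    return server
```

A client would connect with ServerProxy('http://127.0.0.1:%d' % server.server_address[1]) and call listdir('') on it; server.shutdown() stops the background thread.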

Mounting

Any of the filesystems can be mounted on the OS with the fsmount command, which uses FUSE on Linux or DOKAN on Windows. The advantage of this is that the filesystems exposed in Python can be used in any application, and browsed with Explorer or Nautilus. The following creates a ram drive on Linux:

will@will-linux:~$ fsmount mem:// mem

Or on Windows:

C:\> fsmount mem:// M

Get the Code

There is no documentation online for the new features as yet, but if you are a brave soul and want to experiment with any of the above commands then download the code from SVN and run python setup.py install. The command line apps all have a -h option which displays help on the various options.

Bear in mind that these commands are still somewhat experimental, and some of them have the capacity to delete files – so be careful. That said, I'm confident enough to use them for my day-to-day work.

Please see the project page if you want to report bugs or discuss PyFilesystem with me and the other developers.

 

“ #PyFilesystem has been packaged for Debian. :-) http://is.gd/diQ2M ”


PyFilesystem 0.3 released

June 20th, 2010

I am pleased to announce a new version of PyFilesystem (0.3), which is a Python module that provides a common interface to many kinds of filesystem. Basically it provides a way of working with files and directories that is exactly the same, regardless of how and where the file information is stored. Even if you don't plan on working with anything other than the files and directories on your hard-drive, PyFilesystem can simplify your code and reduce the potential for error.

PyFilesystem is a joint effort by myself and Ryan Kelly, who has created a number of new FS implementations such as Amazon S3 support and Secure FTP, and some pretty cool features such as FUSE support and Django storage integration.

As an example of how awesome this package is, take a look at the following 6 lines of code, which create a ramdrive:

from fs.osfs import OSFS
from fs.memoryfs import MemoryFS
from fs.expose import fuse

home_fs = OSFS('~/')
home_fs.makedir('ramdrive', allow_recreate=True)
fuse.mount(MemoryFS(), home_fs.getsyspath('ramdrive'))

If you run this, a directory called ramdrive will appear in your home folder, the contents of which are stored purely in memory.

I prepared a screencast that gives a quick demonstration of some features – because if a picture is worth a thousand words, this video must be worth fifteen thousand words a second:

PyFilesystem screencast from Will McGugan on Vimeo.

See the project page on Google Code for more information, including API docs. There are also a couple of blog posts that will give some more context.

This release has reached a good level of stability and maturity. I'd like to invite as many Pythonistas as possible to check out this module and possibly contribute to the project.

 

So what can you do with FS anyway?

September 28th, 2008

After my last post regarding FS, my file-system abstraction layer for Python, I think I may have left people thinking, "that's nice, but what would you use it for?". Generally, I see it as a way of simplifying file access and exposing only the files you need for your application -- regardless of their physical source. But I can think of a few other uses that may be a little cooler.

Instant Archive Format

The memoryfs.MemoryFS class is a filesystem that exists only in memory. You can create files / directories and copy files to it from other filesystem objects, but when you open a file you will get back a StringIO object (from the standard library). Naturally the files in a MemoryFS are transitory, but as everything is in memory, file access is very fast. I was thinking that since the files and directories are stored as simple Python objects, then a MemoryFS can be pickled -- creating an instant archive format. Uncompressing this archive would simply require unpickling and using the resultant object. I can't see any advantages of this over zip or tar archives, but if you don't have them available, this would be a workable substitute!
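
The pickling idea can be tried out without FS at all. In the sketch below a nested dictionary stands in for a MemoryFS (that substitution is an assumption made purely for illustration), with directories as dicts and files as byte strings.

```python
import pickle

# A nested dict stands in for MemoryFS here: directories are dicts, files are bytes.
memory_tree = {
    'readme.txt': b'Hello from memory',
    'docs': {
        'notes.txt': b'Directories are just nested dicts',
    },
}

# "Archiving" is a single pickle call ...
archive_bytes = pickle.dumps(memory_tree, protocol=pickle.HIGHEST_PROTOCOL)

# ... and "extracting" is unpickling the bytes again.
restored_tree = pickle.loads(archive_bytes)

print(restored_tree['docs']['notes.txt'])
```

One caveat that zip and tar don't share: unpickling can execute arbitrary code, so an archive like this should only ever be loaded from a trusted source.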

Database Filesystem

It wouldn't be particularly challenging to write an FS object that created a filesystem on a database. Actually, if using an ORM like SQLAlchemy, SQLObject, Storm or Django's database layer it would be fairly trivial. So what would you use such a thing for? I can think of one use which may make it worth the effort, and that is the ability to store templates for a Python web application in the database, rather than (or in addition to) files. Consider if you will, a situation where the client has requested a change that is absolutely critical to the success of his company and he won't take "it will be in the next release" for an answer, and neither will your project manager. Wouldn't it be nice if you could go into the Django admin pages and just type in the changes, without making a new release? I'm tempted to implement such a thing, if nobody has done so already!
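
As a rough idea of how little is needed for the template use case, here is a sketch that keeps template text in a database table keyed by path. It uses the standard sqlite3 module rather than an ORM, and it implements only read and write rather than the FS interface; the TemplateStore class and its table layout are hypothetical.

```python
import sqlite3


class TemplateStore(object):
    """Stores template text in a database table keyed by path (illustrative only)."""

    def __init__(self, connection):
        self.connection = connection
        self.connection.execute(
            'CREATE TABLE IF NOT EXISTS templates '
            '(path TEXT PRIMARY KEY, contents TEXT NOT NULL)'
        )

    def write(self, path, contents):
        """Create or overwrite the template stored at path."""
        with self.connection:  # commits on success, rolls back on error
            self.connection.execute(
                'INSERT OR REPLACE INTO templates (path, contents) VALUES (?, ?)',
                (path, contents),
            )

    def read(self, path):
        """Return the template text at path, or None if it does not exist."""
        row = self.connection.execute(
            'SELECT contents FROM templates WHERE path = ?', (path,)
        ).fetchone()
        return row[0] if row else None


store = TemplateStore(sqlite3.connect(':memory:'))
store.write('pages/home.html', '<h1>Hello</h1>')
print(store.read('pages/home.html'))
```

Editing a template then becomes a database write, which is exactly what an admin page could do without a new release.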

Proxying a Filesystem

A generic filesystem server could be written that exposes a remote file-system over whatever channel is available. I know there are some standard ways of doing this, but they might not be available on a given platform. I'm thinking of small devices with limited capabilities, or secure environments. The server would serve an FS object, and the client would present the app with a corresponding interface that pulled files and directories over the wire.

There you go. If anyone implements these, let me know!

 

Announcing FS 0.1.0, a Python file-system

September 21st, 2008

This is something I have been hacking together for a while now; FS is a file-system abstraction for Python. It has reached a stable state and is worthy of an official (0.1.0) release.

First, a brief history of the project. A while back, I was working mainly with desktop applications in wxPython. I found that I had a number of sources for files; there were read-only resources such as images, per-user files and per-installation config files. The location of these files would change depending on whether I was debugging or building a release version. The logic required to manage all this was pretty ugly and error-prone. So I wrote a collection of classes to bring all these disparate locations for files under a single virtual file-system. For example, to open a config file I could just read 'user/settings.cfg', and to open an image resource I could just read from 'resources/logo.png' and the virtual file-system would do the right thing and return a file-like object.

This turned out to be insanely useful, and I used it for a number of projects. I also used it while working for ICC, and I had the opportunity to enhance it based on feedback from colleagues. In the last few months I have re-written it from scratch, because I wanted to avoid any copyright issues, but mainly because I could make a better job of it the second time around.

 
 
© 2008 Will McGugan.

A technoblog blog, design by Will McGugan