September 28, 2008 will

So what can you do with FS anyway?

After my last post regarding FS, my file-system abstraction layer for Python, I think I may have left people thinking, "that's nice, but what would you use it for?". Generally, I see it as a way of simplifying file access and exposing only the files you need for your application -- regardless of their physical source. But I can think of a few other uses that may be a little cooler.

Instant Archive Format

The osfs.MemoryFS class is a filesystem that exists only in memory. You can create files / directories and copy files to it from other filesystem objects, but when you open a file you will get back a StringIO object (from the standard library). Naturally the files in a MemoryFS are transitory, but as everything is in memory, file access is very fast. I was thinking that since the files and directories are stored as simple Python objects, a MemoryFS can be pickled -- creating an instant archive format. Uncompressing this archive would simply require unpickling and using the resultant object. I can't see any advantages of this over zip or tar archives, but if you don't have them available, this would be a workable substitute!
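
To make the pickling idea concrete, here is a minimal sketch that uses only the standard library. It does not use the FS API at all: the nested-dict tree and the make_file / open_file helpers are hypothetical stand-ins for a MemoryFS, chosen only to show that a tree of plain Python objects round-trips through pickle.

```python
import pickle
from io import BytesIO

# Hypothetical stand-in for an in-memory filesystem (NOT the FS API):
# nested dicts are directories, bytes values are file contents.

def make_file(tree, path, data):
    """Store data at a slash-separated path, creating directories as needed."""
    *dirs, name = path.strip("/").split("/")
    node = tree
    for d in dirs:
        node = node.setdefault(d, {})
    node[name] = data

def open_file(tree, path):
    """Return a file-like object wrapping the bytes stored at path."""
    node = tree
    for part in path.strip("/").split("/"):
        node = node[part]
    return BytesIO(node)

tree = {}
make_file(tree, "docs/readme.txt", b"hello")
make_file(tree, "docs/img/logo.bin", b"\x00\x01")

archive = pickle.dumps(tree)      # "archive": serialise the whole tree to bytes
restored = pickle.loads(archive)  # "extract": unpickle and use the object directly
```

The usual pickle caveat applies: only unpickle archives that come from a source you trust.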

Database Filesystem

It wouldn't be particularly challenging to write an FS object that created a filesystem on a database. Actually, if using an ORM like SQLAlchemy, SQLObject, Storm or Django's database layer it would be fairly trivial. So what would you use such a thing for? I can think of one use which may make it worth the effort, and that is the ability to store templates for a Python web application in the database, rather than (or in addition to) files. Consider, if you will, a situation where the client has requested a change that is absolutely critical to the success of his company and he won't take "it will be in the next release" for an answer, and neither will your project manager. Wouldn't it be nice if you could go in to the Django admin pages and just type in the changes, without making a new release? I'm tempted to implement such a thing, if nobody has done so already!
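
A rough sketch of the storage side, assuming plain sqlite3 from the standard library rather than an ORM. The TemplateStore class, its method names and the table layout are all hypothetical; a real FS object would need to implement the full FS interface (directories, listing, opening file-like objects), which this deliberately skips.

```python
import sqlite3

class TemplateStore:
    """Hypothetical helper: keeps template text in one table, keyed by path."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, data TEXT)"
        )

    def write(self, path, data):
        # INSERT OR REPLACE overwrites any existing row with the same path.
        self.conn.execute(
            "INSERT OR REPLACE INTO files (path, data) VALUES (?, ?)", (path, data)
        )
        self.conn.commit()

    def read(self, path):
        row = self.conn.execute(
            "SELECT data FROM files WHERE path = ?", (path,)
        ).fetchone()
        if row is None:
            raise KeyError(path)
        return row[0]

    def exists(self, path):
        cur = self.conn.execute("SELECT 1 FROM files WHERE path = ?", (path,))
        return cur.fetchone() is not None

store = TemplateStore(sqlite3.connect(":memory:"))
store.write("pages/home.html", "<h1>Welcome</h1>")
# The "critical change": edit the stored template without a new release.
store.write("pages/home.html", "<h1>Welcome back</h1>")
```

A template loader would then call something like store.read("pages/home.html") in place of opening a file on disk.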

Proxying a Filesystem

A generic filesystem server could be written that exposes a remote file-system over whatever channel is available. I know there are some standard ways of doing this, but they might not be available on a given platform. I'm thinking of small devices with limited capabilities, or secure environments. The server would serve an FS object, and the client would present the app with a corresponding interface that pulled files and directories over the wire.
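
As one possible shape for this, here is a small sketch using XML-RPC from the standard library as the channel. That choice is an assumption made for illustration, as are the FileService class and its method names; the served "filesystem" is just a dict of path to text, not a real FS object.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

class FileService:
    """Hypothetical server-side object exposing a dict of path -> text."""

    def __init__(self, files):
        self._files = files

    def listdir(self):
        return sorted(self._files)

    def getcontents(self, path):
        return self._files[path]

# Port 0 asks the OS for any free port; logRequests=False keeps stderr quiet.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_instance(FileService({"/notes.txt": "remote hello"}))
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the proxy forwards method calls over HTTP to the server.
host, port = server.server_address
remote = ServerProxy("http://%s:%d" % (host, port))
```

Calling remote.listdir() or remote.getcontents("/notes.txt") then fetches the data over the wire; a fuller client would wrap the proxy in an object that presents the FS interface to the app.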

There you go. If anyone implements these, let me know!

zentux

GOD bless you Will :)
Thanks for your expressive post about FS in detail. It's great !!!

Cristi Constantin
Greetings.
One of my projects is a filesystem inside a SQLITE database. The project is called Private-Briefcase (http://code.google.com/p/private-briefcase)

You can add files one by one or in batch, remove files, rename, copy files, export from database to disk.
Each file is compressed with BZ2 or ZLIB and encrypted with PyCrypto AES. All versions of a file are kept when you add the same file several times. These features can be removed if you think they are obsolete. :)
I am interested in joining the FS project. Feel free to contact me.

Georgi Kolev
Hey :] Great work with the project.. it's really useful o/
Anyway, about Proxying a Filesystem… I'm currently doing something like that with a combination of fs.memoryfs and multiprocessing.managers.BaseManager :] This way I can access the file system from separate processes or a different computer (BaseManager supports remote access)
Oh, though I have a problem… I wanted to combine fs.memoryfs, multiprocessing.managers.BaseManager and fs.expose.fuse. The idea was that I'd mount the memoryfs somewhere and everyone would have access to it (even without using BaseManager…) but I guess I'm doing something wrong, 'cause the mounted filesystem behaves like a fork of the original.. :/

Anyway, here's a short example:


Server:
from fs.memoryfs import MemoryFS
from multiprocessing.managers import BaseManager

class DataManager(BaseManager): pass

class data(object):
    def __init__(self):
        self.memoryfs = MemoryFS()
    def isFile(self, Argument):
        return self.memoryfs.isfile(Argument)
    def isDir(self, Argument):
        # exists() relies on this helper
        return self.memoryfs.isdir(Argument)
    def exists(self, Argument):
        return self.isDir(Argument) or self.isFile(Argument)
    def tree(self, Argument=False):
        # No argument: collect everything walk() yields into a list,
        # so the result can be sent back to the client
        if not Argument:
            return list(self.memoryfs.walk())
        if self.memoryfs.isdir(Argument):
            return self.memoryfs.tree(Argument)
        return False
    # And so on.... you can get the idea :]
    # You can write read/write functions...

# One shared instance; every client proxy talks to this object
dataS = data()
DataManager.register(
    'get_memoryfs', callable=lambda: dataS)
datamgr = DataManager(
    address=('0.0.0.0', 50000),
    authkey=b'somepassword')  # bytes, which Python 3 requires
datasrv = datamgr.get_server()
datasrv.serve_forever()

Client(s):
from multiprocessing.managers import BaseManager

class DataManager(BaseManager): pass

DataManager.register('get_memoryfs')
m = DataManager(
    address=('127.0.0.1', 50000),
    authkey=b'somepassword')  # bytes, which Python 3 requires
m.connect()
memoryfs = m.get_memoryfs()

print(memoryfs.tree())


Anyway, I'm sure there are more and even better ways to do it …
p.s.: if anyone has ideas about the fuse problem I've spoken about… please share them :]

Georgi Kolev
btw are symlinks supported in fs.memoryfs? (if not - will they be?)

Will McGugan
Georgi,

I've not actually used multiprocessing, but I don't think you get shared memory for free. Maybe this will help.

There are no plans for symlinks, because they're not supported by all filesystems, and I want to expose only common functionality where possible.

Will

Georgi Kolev
Will,
Yes, multiprocessing offers a few ways to share data between processes, but in most cases you are limited in what you can do (as with shared C values and arrays), and when you have a lot of data that needs to be shared (I have 100+ dictionaries and 50+ lists… and half of them need to be shared) your code starts to get too complex… and as you know, simple is always better :) Anyway fs.memoryfs is great and I've dropped a lot of code 'cause of it :))) again, great work!

About the symlinks, I see your point…. I'm a long time Linux user (10+ years) and sometimes I forget that there are file systems out there that still don't support symlinks.

Regards

Georgi Kolev
btw here's a little example of what I'm currently doing … if it is useful to anyone:

here's the server process (which holds the data):
http://pastebin.com/QxUz8K4h

and here's an example of a client:
http://pastebin.com/NzqXEE8W

Sorry for the double posting …but I think this can be useful to someone :)