I recently administered a programming test to a number of applicants for a Python developer job at 2degreesnetwork.com. Since we now have a new developer (hi Gustavo!), I figured I would post the test and see what kind of response I get to them.
There are two parts to the test, neither are hugely difficult, but there is enough scope in the solutions to understand how the candidate approaches a problem.
In the first part of the test I asked the candidate to look at the following code and implement the thousands_with_commas function, which should take an integer and return a string representation of that integer with commas separating groups of three digits:
def thousands_with_commas(i): return str(i) if __name__ == '__main__': assert thousands_with_commas(1234) == '1,234' assert thousands_with_commas(123456789) == '123,456,789' assert thousands_with_commas(12) == '12'
I think there is a way of doing this with the standard library, and there is also an implementation in Django, but I was looking for a solution from scratch.
It worked quite well as an interview problem, because there is no one obvious solution and there are a few potential gotchas to tackle.
In the second part of the test, I asked the candidate to implement a function that uses a word list to return the anagrams of a given word.
I started them off with the following code:
def annograms(word): words = [w.rstrip() for w in open('WORD.LST')] raise NotImplementedError if __name__ == "__main__": print annograms("train") print '--' print annograms('drive') print '--' print annograms('python')
This part of the test gave a good indication of how much a candidate new about data structures and performance.
You can post code in the comments [code python] like this [/code].
Feel free to post your solutions in the comments, although I suspect I've seen pretty much all variations on possible solutions!
46 Responses to "Python developer programming tests"
Well, I don't know if I'm very original… Just my solution
Did I pass? ;-)
Lately I've been using asking for a flattening function- specifically-
given an arbitrarily nested set of lists, that either have other tuples or a string (literal string in them, not unicode), write a function that does depth then breadth collapsing. Specifically, given-
I give them the option of having flatten as a generator/iterable instead. Further, I tell them they don't have to worry about recursion depth, and that it will never be cyclic. Beyond that, litering a few ‘,’ (just after the “e” for example) is thrown in since some folks trip up on that (why, hell if I know).
The nice thing about this request is that it pretty quickly nails down if someone can do basic recursion, even just thinking about it properly. It also nails down some basic list methods (append/extend primarily) and awareness of isinstance (or type, which is icky, but whatever). For those who know a bit better (but didn't cover this case), I usually then modify one of the elements from a list to a tuple and ask them to make the code generic enough to work with it… leading to hasattr/__iter__ awareness.
The sad thing about this question is that it's not that hard… but a surprisingly small number of candidates seem to get it (or even get close to a solution), sub 25% going by the last few months of interviewing people.
Either way, enjoy- it's a fun one to level during an interview.
Bah… I typo'd the “given an arbitrarily nested” chunk- the block of the text should either refer *just* to tuples, or *just* to lists. As mentioned in the tail end of the comment, I choose one type there just to make the question a bit easier, give folks a starting point.
Good solutions gentlemen.
Brian, something like this?
I'd be more impressed with a candidate that did that without recursion.
Here is my quick & dirty solution without optimizations:
Brian, tell me if this works :)
Just a note that Python 2.6's itertools.chain.from_iterable flattens its arguments into a single iterable.
Not that I'd expect any job applicant to remember that. I can't :)
Actually, I take it back. It doesn't completely recurse. Someone told me it did, but I just experimented and it doesn't.
So have this:
It would be so awesome if Python could one day gain the ability for yields to go all the way back to the caller - removing the embedded for loop.
The answer to #1 is obviously:
if i == 1234:
return ‘1,234’
elif i == 123456789)
return ‘123,456,789’
elif i == 12
return ‘12’
The answer to #1 is obviously:
Update: Fixed code formatting
There is something a little clumsy about loop, guess it is ‘explicit’ though.
TDD, I think you spotted a flaw in Test Driven Design.
@will/@rolando/@richard:
Guessing y'all hacked that one out on the interpretter… my first response was “strings are iterable”, but for some goofy ass reason strings lack an __iter__ method (but do support iteration). I'm a bit curious why they don't expose __iter__ to introspection, but so it goes (python has some weird warts).
Learn something everyday I guess (fortunately the two folk out of 9 who got close enough to the answer did termination testing first). Other nice thing about this question is that it gives you a window into discussing optimization/performance, intermediate objects generated (and minimizing those), etc. Really kind of a fun, simple question to throw at candidates.
As for the folk asking for a non recursive version, look in http://www.pkgcore.org/trac/pkgcore/browser/snakeoil/snakeoil/lists.py [pkgcore.org] for native_iflatten_* functions. It shoves the stack itself off onto basically an iterable that is internally a deque of other iterables. Think chain, just prependable/appendable after the fact.
Note that those python implementations probably are slower in most cases then just doing a recursive generator- they were implemented that way to be equivalent to the c implementation (http://www.pkgcore.org/trac/pkgcore/browser/snakeoil/src/lists.c [pkgcore.org]) which, at least at the time of writing, was pretty licketly split fast for our usage in pkgcore.
Here are a few attempts at the thousands with commas issues. Not super happy about either of the solutions, but here they are.
The first example reverses the list, and inserts a comma after every 3 items, adjusting for commas. The second solution appends items to a list, and after three items, it adds it to a larger list, then reduces that nested list into a string.
Variation on the thousands…
@rolando: your anagrams function at first glance will fail on word ‘loop’, w/ word list having ‘poll’. the chars used are the same, len is the same, but the # of chars to work with varies.
another variation, since I'm a bit bored atm-
Nice thing about that version of the solution is it doesn't beat on memory- if the word list is massive (GB), it'll run with it, and if the source word is reasonably massive (say 1024 chars) it doesn't allocate a meg for the potential permutations to intersect against.
That said, the sorted is a bit of hack, and probably not the fastest approach. Simple though.
Here's my solution for “thousand_with_commas”. It's a bit verbose compared to others but ought to work:
Well, just like asked, a non-recursive solution… It uses a stack, which may be considered cheating. ;-)
My take:
Alberto
I first thought about the modulo solution to thousands, but then went with the reversed string approach like so:
I think this reads easy, and for that reason I also like Brian Harring's straight list approach above. I definitely found a couple corners cases playing around with this one, and can see how it makes a good interview question.
For annograms, I went for an implementation that emphasizes speed, especially considering that the starting code suggests the entire word list will fit into memory. The word list is pre-processed and stored into a dict for rapid lookup using the sorted hash. I use a WordPlay class instance to hold and reuse the word dictionary.
Ok I know thats excatly not what you would excpect in a job interview :-)
But here is the long version I would have delivered:
The annograms one:
A simple, easy to read loop and a slightly iffy list comprehension:
Anagrams
Huub900's anagram answer is fantastic.
Would this pass? It was the first thing I came up with, but I acknowledge that a very, very large number would blow the stack, and it's probably not very pythonic:
The spec for thousands_with_commas says that it should accept an integer (it doesn't say positive integer).
annograms() can be turned into a generator by changing ‘filter’ to ‘ifilter’
And because a promise is a debt, one with regular expressions!
I think it's more readable and not too different than other solutions as far as performance is concerned.
I've already seen my solutions for the other ones on here, so I won't bother, but here's my flatten function:
This should, in theory, handle any iterable item, since it just tries to iterate, and catches the exception if it can't.
Paul,
What about this?
Will
Here's my attempt at non-recursive flatten
Here is what I came up with
I'd have naturally gone with a string reversing approach, but folks above have already done that. I was thinking that it would be nice to avoid the two reverses, perhaps by imagining that ‘1234’ is split into ‘xx1’ and ‘234’ where ‘x’ is the non-existing stuff before the start of the string.
That gave rise to my first time I can recall where I've used modulo with a negative right operand. It gives just the starting indexes required. Here, Python's use of an index of -2 as len(s) - 2 happens to be unwanted, hence the max() to clamp it to 0. I've also added more test cases, including for some big numbers.
Perhaps someone may like to gather the various solutions here and test and time them?
If using regexp… why not the simpler:
“Me”: That's going to start matching with the very first digit, and stride forward three digits at a time until it finds out if it's at the end of the string. Whether it matches then or not, it's then going to do exactly the same starting with the second digit, striding forward three at a time.
Clearly, after putting in a comma, there's no need to try and match starting with the first or second digit after that comma. For 2 ** 7 ** 7 that may be a worthwhile saving. :-)
“Ralph Corderoy”: You're right… however also this one
is still quite slow (even compiling the regexp just once)… so if speed is your concern then regexp is probably not the tool for this job (and if you're at this microptimization level probably python is IMO a questionable choice in general).
Moreover I think that should be possible to write a regexp engine that can run the first version without being O(n^2)
“Me”: I agree that Python's re module isn't the way to go if speed is an issue. But then my code above didn't use re. :-) Also, I think you're right that an RE engine could have the ability to avoid the O(n**2), but if we're writing that, we could more easier write thousands_with_commas() instead. :-)
This flattens without recursion and uses lists without performance penalties:
This is more than 2x faster and simpler than previous version:
A little bit faster version:
This version doesn't use library functions (so I wrote it without referring to docs, just like I'm on an interview :) ), but it's just as fast as Jaime's version from the first comment (I ran timeit's on a couple of versions posted here, and it turned out that first one was the fastest). Perhaps, it should be tweaked to use generators to decrease memory footprint, but I just optimized for speed and readability.
Regarding thousands with commas: it's so super easy, I'm surprised that only mj's solution uses recursion. Sure, it crashes on ridiculously long numbers, but on the other hand, it's not what it's supposed to be used for, right?
You could go on to review any submission for performance, but the question as you state it seems to ask for just a correct answer.
It would be better to follow up with questions on code maintenance rather than performance if you didn't mention performance when posing the original questions.
- Paddy.
My easy to read version
Hi guys,
I am learning Python at the moment and will apply for a future jobs; however, Python is a secondary language. What kind of questions would you ask a new Python developer?
Thanks
A wise group once said:
“ It's very tempting to use these ‘cool’ features when they're not absolutely necessary. It's harder to read, understand, and debug code that's using unusual features underneath. It doesn't seem that way at first (to the original author), but when revisiting the code, it tends to be more difficult than code that is longer but is straightforward. ”
I would have followed the above if I were hiring.
Just keep it simple.
You don't really need to put everything on a single line.
This was fun.