I recently administered a programming test to a number of applicants for a Python developer job at 2degreesnetwork.com. Since we now have a new developer (hi Gustavo!), I figured I would post the test and see what kind of response I get to them.

There are two parts to the test, neither are hugely difficult, but there is enough scope in the solutions to understand how the candidate approaches a problem.

In the first part of the test I asked the candidate to look at the following code and implement the thousands_with_commas function, which should take an integer and return a string representation of that integer with commas separating groups of three digits:

def thousands_with_commas(i):

    return str(i)


if __name__ == '__main__':

    assert thousands_with_commas(1234) == '1,234'
    assert thousands_with_commas(123456789) == '123,456,789'
    assert thousands_with_commas(12) == '12'

I think there is a way of doing this with the standard library, and there is also an implementation in Django, but I was looking for a solution from scratch.

It worked quite well as an interview problem, because there is no one obvious solution and there are a few potential gotchas to tackle.

In the second part of the test, I asked the candidate to implement a function that uses a word list to return the anagrams of a given word.

I started them off with the following code:

def annograms(word):

    words = [w.rstrip() for w in open('WORD.LST')]

    raise NotImplementedError


if __name__ == "__main__":

    print annograms("train")
    print '--'
    print annograms('drive')
    print '--'
    print annograms('python')

This part of the test gave a good indication of how much a candidate new about data structures and performance.

You can post code in the comments [code python] like this [/code].

Feel free to post your solutions in the comments, although I suspect I've seen pretty much all variations on possible solutions!

This blog post was posted to It's All Geek to Me on Monday December 21st, 2009 at 7:30PM
 

48 Responses to "Python developer programming tests"

  • Jaime
    December 21st, 2009, 9:20 p.m.

    Well, I don't know if I'm very original… Just my solution

    import itertools

    def thousands_with_commas(i):
    GROUP_VALUE = 1000

    #Get the values in groups of thousands
    value = i
    groups = []
    while(value % GROUP_VALUE):
    groups.append((value % GROUP_VALUE))
    value = value // GROUP_VALUE

    #Reverse, change to str and add commas
    conv = [str(j) for j in reversed(groups) ]
    return ','.join(conv)


    def annograms(word):

    words = [w.rstrip() for w in open('WORD.LST')]
    #Reduce the number of words to compare to
    # just the ones with the same lengths
    # Also make a set to avoid duplicates
    words = set([ w for w in words if len(w) == len(word)])

    #Find all possible anagrams
    comb = set([ ''.join(w) for w in
    itertools.permutations(word, len(word))])

    #Now find the ones that are words intersecting two sets
    return comb.intersection(words)


    if __name__ == '__main__':

    assert thousands_with_commas(1234) == '1,234'
    assert thousands_with_commas(123456789) == '123,456,789'
    assert thousands_with_commas(12) == '12'


    print annograms("train")
    print '--'
    print annograms('drive')
    print '--'
    print annograms('python')

    Did I pass? ;-)

  • December 21st, 2009, 9:28 p.m.
    from collections import defaultdict
    def thousands_with_commas(i):
    si=str(i)[::-1]
    return (','.join(si[i:i+3] for i in range(0,len(si),3)))[::-1]

    def annograms(word):
    words_by_anagram=defaultdict(set)
    for word in [w.rstrip() for w in open('WORD.LST')]:
    words_by_anagram[''.join(sorted(word))].add(word)
    return words_by_anagram[word]
  • December 21st, 2009, 9:28 p.m.
    # This is the standard Z fixed point combinator, often mistaken to be the
    # Y combinator, which can't, AFAIK, be expressed in Python
    Z = lambda g: (lambda f: g(lambda arg: f(f)(arg))) (lambda f: g(lambda arg: f(f)(arg)))

    insert_thousand_sign = lambda s: \
    lambda f: \
    lambda a: str(a) if a < 1000 \
    else (lambda (d, m): '%s%s%03d' % (f(d), s, m))(
    divmod(a, 1000))

    thousands_with_commas = Z(insert_thousand_sign(','))

    if __name__ == '__main__':
    tests = (
    (1234, '1,234'),
    (123456789, '123,456,789'),
    (12, '12'),
    )

    for (v, r) in tests:
    print '%d: %s' % (v, thousands_with_commas(v))
    assert thousands_with_commas(v) == r
  • Brian Harring
    December 21st, 2009, 9:51 p.m.

    Lately I've been using asking for a flattening function- specifically-

    given an arbitrarily nested set of lists, that either have other tuples or a string (literal string in them, not unicode), write a function that does depth then breadth collapsing. Specifically, given-

    seq = [["a"], "b", [["c", "d"], "e",], "f"]
    assert flatten(seq) == ['a', 'b', 'c', 'd', 'e', 'f']

    I give them the option of having flatten as a generator/iterable instead. Further, I tell them they don't have to worry about recursion depth, and that it will never be cyclic. Beyond that, litering a few ‘,’ (just after the “e” for example) is thrown in since some folks trip up on that (why, hell if I know).

    The nice thing about this request is that it pretty quickly nails down if someone can do basic recursion, even just thinking about it properly. It also nails down some basic list methods (append/extend primarily) and awareness of isinstance (or type, which is icky, but whatever). For those who know a bit better (but didn't cover this case), I usually then modify one of the elements from a list to a tuple and ask them to make the code generic enough to work with it… leading to hasattr/__iter__ awareness.

    The sad thing about this question is that it's not that hard… but a surprisingly small number of candidates seem to get it (or even get close to a solution), sub 25% going by the last few months of interviewing people.

    Either way, enjoy- it's a fun one to level during an interview.

  • Brian Harring
    December 21st, 2009, 9:52 p.m.

    Bah… I typo'd the “given an arbitrarily nested” chunk- the block of the text should either refer *just* to tuples, or *just* to lists. As mentioned in the tail end of the comment, I choose one type there just to make the question a bit easier, give folks a starting point.

  • December 21st, 2009, 10:23 p.m.

    Good solutions gentlemen.

    Brian, something like this?

    def flatten(l):
    def recurse():
    for i in l:
    if hasattr(i, '__iter__'):
    for ii in flatten(i):
    yield ii
    else:
    yield i
    return list(recurse())

    I'd be more impressed with a candidate that did that without recursion.

  • December 21st, 2009, 10:44 p.m.

    Here is my quick & dirty solution without optimizations:

    def thousands_with_commas(i):
    #return str(i)
    rnum = list()
    for (counter,number) in enumerate(reversed(str(i))):
    rnum.append(number)
    if (counter+1) % 3 == 0 and not (counter+1) == len(str(i)):
    rnum.append(',')
    return ''.join(reversed(rnum))

    if __name__ == '__main__':
    assert thousands_with_commas(1234) == '1,234'
    assert thousands_with_commas(123456789) == '123,456,789'
    assert thousands_with_commas(12) == '12'
    def annograms(word):
    words = [w.rstrip() for w in open('WORD.LST')]
    #raise NotImplementedError
    chars = set(word)
    annograms = list()
    for w in words:
    if set(w) == chars and not w == word and len(w) == len(word):
    annograms.append(w)
    return annograms

    if __name__ == "__main__":
    print annograms("train")
    print '--'
    print annograms('drive')
    print '--'
    print annograms('python')

    Brian, tell me if this works :)

    def flatten(l):
    #raise NotImplementedError
    if hasattr(l, '__iter__'):
    flat = list()
    for e in l:
    flat.extend(flatten(e))
    return flat
    else:
    return [l]


    if __name__ == '__main__':

    seq = [("b",),[("e",),],"f"]

    assert flatten(seq) == ['b', 'e', 'f']
  • Richard Jones
    December 21st, 2009, 11:26 p.m.

    Just a note that Python 2.6's itertools.chain.from_iterable flattens its arguments into a single iterable.

    Not that I'd expect any job applicant to remember that. I can't :)

  • Richard Jones
    December 21st, 2009, 11:38 p.m.

    Actually, I take it back. It doesn't completely recurse. Someone told me it did, but I just experimented and it doesn't.

    So have this:

    >>> def flatten(iterable):
    ... for elem in iterable:
    ... if hasattr(elem, '__iter__'):
    ... for sub in flatten(elem): yield sub
    ... else:
    ... yield elem
    ...
    >>> list(flatten([[1,2,3],[[2,3],4],[4,5,6]]))
    [1, 2, 3, 2, 3, 4, 4, 5, 6]

    It would be so awesome if Python could one day gain the ability for yields to go all the way back to the caller - removing the embedded for loop.

  • TDD Expert
    December 21st, 2009, 11:39 p.m.

    The answer to #1 is obviously:

    if i == 1234:

    return ‘1,234’

    elif i == 123456789)

    return ‘123,456,789’

    elif i == 12

    return ‘12’

  • TDD Expert
    December 21st, 2009, 11:40 p.m.

    The answer to #1 is obviously:

    if i == 1234:
    return '1,234'
    elif i == 123456789):
    return '123,456,789'
    elif i == 12:
    return '12'

    Update: Fixed code formatting

  • December 21st, 2009, 11:52 p.m.

    There is something a little clumsy about loop, guess it is ‘explicit’ though.

    TDD, I think you spotted a flaw in Test Driven Design.

  • Brian Harring
    December 22nd, 2009, 2:41 a.m.

    @will/@rolando/@richard:

    Guessing y'all hacked that one out on the interpretter… my first response was “strings are iterable”, but for some goofy ass reason strings lack an __iter__ method (but do support iteration). I'm a bit curious why they don't expose __iter__ to introspection, but so it goes (python has some weird warts).

    Learn something everyday I guess (fortunately the two folk out of 9 who got close enough to the answer did termination testing first). Other nice thing about this question is that it gives you a window into discussing optimization/performance, intermediate objects generated (and minimizing those), etc. Really kind of a fun, simple question to throw at candidates.

    As for the folk asking for a non recursive version, look in http://www.pkgcore.org/trac/pkgcore/browser/snakeoil/snakeoil/lists.py [pkgcore.org] for native_iflatten_* functions. It shoves the stack itself off onto basically an iterable that is internally a deque of other iterables. Think chain, just prependable/appendable after the fact.

    Note that those python implementations probably are slower in most cases then just doing a recursive generator- they were implemented that way to be equivalent to the c implementation (http://www.pkgcore.org/trac/pkgcore/browser/snakeoil/src/lists.c [pkgcore.org]) which, at least at the time of writing, was pretty licketly split fast for our usage in pkgcore.

  • December 22nd, 2009, 6:37 a.m.

    Here are a few attempts at the thousands with commas issues. Not super happy about either of the solutions, but here they are.

    The first example reverses the list, and inserts a comma after every 3 items, adjusting for commas. The second solution appends items to a list, and after three items, it adds it to a larger list, then reduces that nested list into a string.

    def thousands_with_commas(i):
    total_commas = 0
    ints = list(str(i))
    rev_ints = ints[::-1]
    rev_ints_enum = list(rev_ints)
    for i, char in enumerate(rev_ints_enum):
    if i % 3 == 0 and i > 0:
    rev_ints.insert(i + total_commas, ',')
    total_commas += 1
    return ''.join(rev_ints[::-1])

    def thousands_with_commas2(i):
    groups = []
    inner_group = []
    rev = list(str(i))[::-1]
    for count, num in enumerate(rev):
    if not count % 3 and count != 0:
    groups.append(inner_group[::-1])
    inner_group = list()
    inner_group.append(num)
    groups.append(inner_group[::-1])
    rev_groups = groups[::-1] # reverse
    result_list = []
    for csg in rev_groups:
    result_list.append(''.join(csg))
    return ','.join(result_list)

    if __name__ == '__main__':
    assert thousands_with_commas2(1234) == '1,234'
    assert thousands_with_commas2(123456789) == '123,456,789'
    assert thousands_with_commas2(12) == '12'

    assert thousands_with_commas(1234) == '1,234'
    assert thousands_with_commas(123456789) == '123,456,789'
    assert thousands_with_commas(12) == '12'
  • Brian Harring
    December 22nd, 2009, 7:19 a.m.

    Variation on the thousands…

    def thousands_with_comma(i):
    s, l = str(i), []
    skip = len(s) % 3
    if skip:
    l.append(s[:skip])
    for start in xrange(skip, len(s), 3):
    l.append(s[start:start + 3])
    return ','.join(l)

    @rolando: your anagrams function at first glance will fail on word ‘loop’, w/ word list having ‘poll’. the chars used are the same, len is the same, but the # of chars to work with varies.

    another variation, since I'm a bit bored atm-

    def anagrams(word, word_src='WORD.LST'):
    """assumes word_src has no duplicates; find anagram combinations from word..."""
    word = sorted(word)
    for potential in open(word_src):
    potential = potential.strip()
    if sorted(potential) == word:
    yield potential

    Nice thing about that version of the solution is it doesn't beat on memory- if the word list is massive (GB), it'll run with it, and if the source word is reasonably massive (say 1024 chars) it doesn't allocate a meg for the potential permutations to intersect against.

    That said, the sorted is a bit of hack, and probably not the fastest approach. Simple though.

  • December 22nd, 2009, 7:22 a.m.

    Here's my solution for “thousand_with_commas”. It's a bit verbose compared to others but ought to work:

    def split_len(seq, length):
    # [url]http://code.activestate.com/recipes/496784/[/url]
    return [seq[i:i+length] for i in range(0, len(seq), length)]

    def thousands_with_commas(num):
    segment_len = 3
    separator = ','
    num_str = str(num)

    # figure out the size of the first segment
    first_size = len(num_str) % segment_len or segment_len

    # construct segments
    segments = [num_str[:first_size], ]
    segments.extend(split_len(num_str[first_size:], segment_len))

    return separator.join(segments)
  • Jaime
    December 22nd, 2009, 8:58 a.m.

    Well, just like asked, a non-recursive solution… It uses a stack, which may be considered cheating. ;-)

    def is_iterable(element):
    if hasattr(element, '__iter__'):
    return True
    else:
    return False

    def flatten(array):
    result = []

    #In case the array is not iterable, return it
    if not is_iterable(array):
    return array

    #Use an stack (maybe it's cheat). The stack stores the
    # iterators of the last called elements
    stack = [ array.__iter__() ]
    #Get the element from the last iterator
    element = stack[0].next()
    while (element):
    #Check iterability
    if is_iterable(element):
    #Store on the stack
    stack.append(element.__iter__())
    else:
    result.append(element)

    #Get the next result, which will be
    # the next iteration on the last stacked value
    element = None
    while not element:
    try:
    element = next(stack[-1])
    except StopIteration:
    #Get the previous element on the stack
    #on next iteration of while loop
    stack.pop()

    if not stack:
    #Empty stack, exit and return filled
    #flatten array
    return result




    if __name__ == '__main__':

    seq = [("b",), [("e",), ], "f"]

    assert flatten(seq) == ['b', 'e', 'f']
    assert flatten('a') == 'a'
  • Alberto
    December 22nd, 2009, 9:24 a.m.

    My take:

    def thousands_with_commas(i):
    groups = []
    while i:
    i, m = divmod(i, 1000)
    groups.append(m)
    return ','.join(map(str, reversed(groups)))

    Alberto

  • cybersol
    December 22nd, 2009, 9:36 a.m.

    I first thought about the modulo solution to thousands, but then went with the reversed string approach like so:

    def thousands_with_commas(i):
    str_i = str(i)
    reverse_i = str_i[::-1]
    num_groups = (len(str_i)+2)/3
    by_threes = [reverse_i[3*x:3*(x+1)] for x in xrange(num_groups)]
    reverse_out = ','.join(by_threes)
    return ''.join(reversed(reverse_out))

    I think this reads easy, and for that reason I also like Brian Harring's straight list approach above. I definitely found a couple corners cases playing around with this one, and can see how it makes a good interview question.

    For annograms, I went for an implementation that emphasizes speed, especially considering that the starting code suggests the entire word list will fit into memory. The word list is pre-processed and stored into a dict for rapid lookup using the sorted hash. I use a WordPlay class instance to hold and reuse the word dictionary.

    import collections
    class WordPlay(object):
    def __init__(self, infile="WORD.LST"):
    self.worddict = collections.defaultdict(list)
    for w in open(infile):
    s = w.strip()
    self.worddict[tuple(sorted(s))].append(s)

    def annograms(self, word):
    ws = tuple(sorted(word))
    if ws in self.worddict:
    return self.worddict[ws]
    else:
    return []

    if __name__ == "__main__":
    wp = WordPlay()
    annograms = wp.annograms
    print annograms("train")
  • Gregor Müllegger
    December 22nd, 2009, 9:47 a.m.

    Ok I know thats excatly not what you would excpect in a job interview :-)

    thousands_with_commas = lambda i: ''.join([',' + c if i and i % 3 == 0 else c for i,c in enumerate(str(i)[::-1])])[::-1]

    But here is the long version I would have delivered:

    def thousands_with_commas(i):
    s = str(i)
    output = ''
    for i,c in enumerate(s[::-1]):
    if i != 0 and i % 3 == 0:
    output += ','
    output += c
    return output[::-1]
  • Alberto
    December 22nd, 2009, 9:50 a.m.

    The annograms one:

    def annograms(word, words=None):
    words = words or [w.strip() for w in open('WORDS.LST')]
    # lowest cost checks first
    return [w for w in words if len(w) == len(word) and
    set(w) == set(word) and
    tuple(word) in permutations(w)]
  • Huub900
    December 22nd, 2009, 10:55 a.m.
    def anagrams(word):
    words = [w.rstrip() for w in open('WORD.LST')]
    sword = sorted(word)
    return [w for w in words if sorted(w) == sword]
  • December 22nd, 2009, 1:20 p.m.

    A simple, easy to read loop and a slightly iffy list comprehension:

    def thousands_with_commas(i):
    s = list(str(i))

    c = 0
    for x in range( len(s)-1, 0, -1 ):
    c += 1
    if c % 3 == 0:
    s.insert(x, ',')

    return "".join(s)


    def with_commas2(i):

    s = str(i)[::-1]
    return "".join( [ s[i:i+3] + ',' for i in range(0,len(s),3) ] )[::-1][1:]


    if __name__ == '__main__':

    assert thousands_with_commas(1234) == '1,234'
    assert thousands_with_commas(123456789) == '123,456,789'
    assert thousands_with_commas(12) == '12'

    assert with_commas2(1234) == '1,234'
    assert with_commas2(123456789) == '123,456,789'
    assert with_commas2(12) == '12'
  • December 22nd, 2009, 1:56 p.m.

    Anagrams

    import sets

    def annograms(word):
    wl = len(word)
    word = sets.Set(word)

    words = [w.rstrip() for w in open('WORD.LST')]

    result = []
    for w in words:
    if len(w) == wl:
    if len( sets.Set(w).intersection(word) ) == wl:
    result.append(w)

    return result

    if __name__ == "__main__":

    print annograms("train")
    print '--'
    print annograms('drive')
    print '--'
    print annograms('python')

    Huub900's anagram answer is fantastic.

  • December 22nd, 2009, 6:28 p.m.
    import itertools

    # In an interpreter with a real GC, this might leak an FD...
    annograms = lambda word: map(lambda c: ''.join(c),
    set(itertools.permutations(word)).intersection(
    tuple(w.rstrip()) for w in open('WORD.LST')))
  • mj
    December 22nd, 2009, 9:32 p.m.

    Would this pass? It was the first thing I came up with, but I acknowledge that a very, very large number would blow the stack, and it's probably not very pythonic:

    def thousands_with_commas(i):
    num = str(i)
    if len(num) <= 3:
    return num
    else:
    return thousands_with_commas(num[0:len(num) - 3]) + "," + num[-3:]
  • Mike
    December 22nd, 2009, 10:58 p.m.

    The spec for thousands_with_commas says that it should accept an integer (it doesn't say positive integer).

    def thousands_with_commas(i):
    if i < 0:
    return '-' + thousands_with_commas(-i)

    else:
    i,r = divmod(i,1000)
    s = [str(r)]
    while i>0:
    i,r = divmod(i,1000)
    s.append(str(r))

    return ','.join(reversed(s))

    annograms() can be turned into a generator by changing ‘filter’ to ‘ifilter’

    from string import strip
    from itertools import imap

    def annograms(word):
    letters = sorted(word)
    sz = len(word)

    def check(w):
    return len(w) == sz and sorted(w) == letters

    return filter(check,imap(strip,open('WORD.LST')))
  • December 22nd, 2009, 11:44 p.m.

    And because a promise is a debt, one with regular expressions!

    import re

    _INTEGER_RE = re.compile(r"^(?P<start>[\d]{1,3}?)(?P<thousands>(\d{3})*)$")
    _THOUSANDS_RE = re.compile(r"\d{3}")

    def thousands_with_commas(i):
    match = _INTEGER_RE.match(str(i))
    start = match.group("start")
    thousands = _THOUSANDS_RE.findall(match.group("thousands"))
    return ",".join([start] + thousands)

    if __name__ == '__main__':

    assert thousands_with_commas(1234) == '1,234'
    assert thousands_with_commas(123456789) == '123,456,789'
    assert thousands_with_commas(12) == '12'
    print "You got the job! Not really, it's too late now :)"

    I think it's more readable and not too different than other solutions as far as performance is concerned.

  • January 4th, 2010, 7:31 p.m.

    I've already seen my solutions for the other ones on here, so I won't bother, but here's my flatten function:

    def flatten_list(l):
    try:
    for item in l:
    for i in flatten_list(item): yield i
    except:
    yield l

    This should, in theory, handle any iterable item, since it just tries to iterate, and catches the exception if it can't.

  • January 5th, 2010, 6:54 p.m.

    Paul,

    What about this?

    >>> list(flatten_list(['foo', 'bar']))

    Will

  • Joe Sixpack
    January 7th, 2010, 6:55 p.m.

    Here's my attempt at non-recursive flatten

    def flatten(l):
    l = list(l)
    i = 0
    while True:
    if i == len(l): break
    if isinstance(l[i], basestring):
    i += 1
    continue
    try:
    l[i:i+1] = l[i]
    except:
    i += 1
    return l

    L = [('Launching',), 5, 4, ([],), [3,2,1, 'blast off'],()]
    print(flatten(L))
    # ['Launching', 5, 4, 3, 2, 1, 'blast off']
  • Harry
    January 18th, 2010, 11:57 p.m.

    Here is what I came up with

    def ts(y):
    return ','.join(re.findall('\d{0,3}', str(y)[::-1]))[::-1].strip(',')
  • Ralph Corderoy
    February 1st, 2010, 2:10 p.m.

    I'd have naturally gone with a string reversing approach, but folks above have already done that. I was thinking that it would be nice to avoid the two reverses, perhaps by imagining that ‘1234’ is split into ‘xx1’ and ‘234’ where ‘x’ is the non-existing stuff before the start of the string.

    That gave rise to my first time I can recall where I've used modulo with a negative right operand. It gives just the starting indexes required. Here, Python's use of an index of -2 as len(s) - 2 happens to be unwanted, hence the max() to clamp it to 0. I've also added more test cases, including for some big numbers.

    Perhaps someone may like to gather the various solutions here and test and time them?

    #! /usr/bin/python

    def thousands_with_commas(i):
    if i < 0:
    return '-' + thousands_with_commas(-i)
    s = str(i)
    l = len(s)
    return ','.join((s[max(n, 0):n + 3] for n in xrange(l % -3, l, 3)))

    if __name__ == '__main__':
    assert thousands_with_commas(1234) == '1,234'
    assert thousands_with_commas(123456789) == '123,456,789'
    assert thousands_with_commas(12) == '12'

    for i, o in (
    (0, '0'),
    (1, '1'),
    (12, '12'),
    (123, '123'),
    (1234, '1,234'),
    (12345, '12,345'),
    (123456, '123,456'),
    (1234567, '1,234,567'),
    ):
    assert thousands_with_commas(i) == o
    if i:
    assert thousands_with_commas(-i) == '-' + o

    print 'Pass so far. This next bit can take a while...'

    import re
    for n in range(8):
    n = 2 ** n ** n
    s = thousands_with_commas(n)
    assert re.match(r'^\d{1,3}(?:,\d{3})*$', s)
    s = s.replace(',', '')
    assert int(s) == n

    print 'All pass.'
  • Me
    February 6th, 2010, 2:17 p.m.

    If using regexp… why not the simpler:

    import re

    def thousands_with_commas(i):
    return re.sub(r'(\d)(?=(\d\d\d)+$)', r'\1,', str(i))
  • Ralph Corderoy
    February 6th, 2010, 3:32 p.m.

    “Me”: That's going to start matching with the very first digit, and stride forward three digits at a time until it finds out if it's at the end of the string. Whether it matches then or not, it's then going to do exactly the same starting with the second digit, striding forward three at a time.

    Clearly, after putting in a comma, there's no need to try and match starting with the first or second digit after that comma. For 2 ** 7 ** 7 that may be a worthwhile saving. :-)

  • Me
    February 9th, 2010, 6:42 p.m.

    “Ralph Corderoy”: You're right… however also this one

    def thousands_with_commas(i):
    return re.sub(r'(\d\d\d)(?=\d)', r'\1,',
    str(i)[::-1])[::-1]

    is still quite slow (even compiling the regexp just once)… so if speed is your concern then regexp is probably not the tool for this job (and if you're at this microptimization level probably python is IMO a questionable choice in general).

    Moreover I think that should be possible to write a regexp engine that can run the first version without being O(n^2)

  • Ralph Corderoy
    February 11th, 2010, 6:03 p.m.

    “Me”: I agree that Python's re module isn't the way to go if speed is an issue. But then my code above didn't use re. :-) Also, I think you're right that an RE engine could have the ability to avoid the O(n**2), but if we're writing that, we could more easier write thousands_with_commas() instead. :-)

  • Mars
    February 16th, 2010, 5:26 a.m.

    This flattens without recursion and uses lists without performance penalties:

    def flatten(l):
    if isinstance(l, basestring):
    return l

    # get TypeError if not iterable
    iter(l)

    rs = []
    while l:
    last = l.pop()
    if isinstance(last, basestring):
    rs.append(last)
    continue
    try:
    iter(last)
    l.extend(last)
    except TypeError:
    rs.append(last)

    rs.reverse()
    return rs
  • Mars
    February 16th, 2010, 8:56 a.m.

    This is more than 2x faster and simpler than previous version:

    def flatten(l):
    if isinstance(l, basestring):
    return l

    # raise TypeError if not iterable
    iter(l)

    rs = []
    while l:
    last = l.pop()
    if hasattr(last, '__iter__'):
    l.extend(last)
    else:
    rs.append(last)

    rs.reverse()
    return rs
  • Mars
    February 16th, 2010, 9:13 a.m.

    A little bit faster version:

    def flatten(l):
    if isinstance(l, basestring):
    return l

    # raise TypeError if not iterable
    iter(l)

    rs = []
    try:
    while True:
    last = l.pop()
    if hasattr(last, '__iter__'):
    l.extend(last)
    else:
    rs.append(last)
    except IndexError:
    pass

    rs.reverse()
    return rs
  • Nick Lutsiuk
    June 14th, 2010, 10:37 p.m.

    This version doesn't use library functions (so I wrote it without referring to docs, just like I'm on an interview :) ), but it's just as fast as Jaime's version from the first comment (I ran timeit's on a couple of versions posted here, and it turned out that first one was the fastest). Perhaps, it should be tweaked to use generators to decrease memory footprint, but I just optimized for speed and readability.

    def annograms(word):
    words = [w.rstrip() for w in open('WORD.LST')]
    length, searched = len(word), sorted(word)
    words = [w for w in words if len(w) == length]
    return [w for w in words if w != word and sorted(w) == searched]

    Regarding thousands with commas: it's so super easy, I'm surprised that only mj's solution uses recursion. Sure, it crashes on ridiculously long numbers, but on the other hand, it's not what it's supposed to be used for, right?

    def twc(s):
    s = str(s)
    return twc(s[:-3]) + "," + s[-3:] if len(s) > 3 else s
  • January 15th, 2011, 12:45 p.m.

    You could go on to review any submission for performance, but the question as you state it seems to ask for just a correct answer.

    It would be better to follow up with questions on code maintenance rather than performance if you didn't mention performance when posing the original questions.

    - Paddy.

  • September 20th, 2011, 5:15 p.m.

    My easy to read version

    def thousands_with_commas(i):
    reversed = str(i)[::-1]
    fnum = ""
    for i in range(0,len(reversed),3):
    fnum += reversed[i:i+3] + ','
    return fnum[::-1][1:]
  • George
    November 22nd, 2011, 2:52 p.m.

    Hi guys,

    I am learning Python at the moment and will apply for a future jobs; however, Python is a secondary language. What kind of questions would you ask a new Python developer?

    Thanks

  • Val
    August 2nd, 2012, 1:31 p.m.

    A wise group once said:


    “ It's very tempting to use these ‘cool’ features when they're not absolutely necessary. It's harder to read, understand, and debug code that's using unusual features underneath. It doesn't seem that way at first (to the original author), but when revisiting the code, it tends to be more difficult than code that is longer but is straightforward. ”

    I would have followed the above if I were hiring.
    Just keep it simple.

    You don't really need to put everything on a single line.

  • Sky
    January 15th, 2013, 5:41 p.m.

    This was fun.

    def thousands_with_commas(i):
        i = str(i)
        returnVal = ""
        end = len(i) - 1
        for charId in range(len(i)):
            posFromEnd = (len(i) - 1) - charId
            char = i[charId]
            if posFromEnd %3 == 0 and len(i) > 3 and charId != end:
                returnVal += ( char + ",")
            else:
                returnVal += char
        return str(returnVal)
    

  • June 29th, 2013, 5:35 p.m.

    Here's mine - I'm pretty new to Python so this is going to be too ABC for a job but it seems to work fine.

    #!/usr/bin/python
    # programming test to put commas into numbers correctly
    # input is an integer so no need to worry about decimal points etc

    def thousands_with_commas(num):
    string = str(num);
    numcommas = (len(string)-1) / 3;
    outstring = '';
    i = 0;
    while i < numcommas:
    outstring = ',' + string[(len(string) - ((i+1) * 3)):len(string) - (i*3)] + outstring;
    i += 1;
    if i * 3 < len(string): outstring = string[0:(len(string)%3)] + outstring;
    if (i + 1) * 3 == len(string): outstring = string[0:3] + outstring;
    return outstring;

    print thousands_with_commas(1234);
    print thousands_with_commas(123456789);
    print thousands_with_commas(12);

  • Cyril B
    August 2nd, 2013, 4:58 p.m.

    Hey guys,
    I loved reading your posts.

    Does anyone of you know of reliable online test to evaluate a :
    - Python trainee,
    - Python Jr,
    - Python Sr,
    - Python CTO ?

    The goal is to identify relevant candidates to implement / enrich OpenEdX.

    Thank you all in advance for your answers.
    Best,
    Cyril

Leave a Comment

You can use bbcode in the comment: e.g. [b]This is bold[/b], [url]http://www.willmcgugan.com[/url], [code python]import this[/code]
Preview Posting...
Previewing comment, please wait a moment...

My Tweets

Will McGugan

My name is Will McGugan. I am an unabashed geek, an author, a hacker and a Python expert – amongst other things!

Search for Posts
Possibly related posts
Tags
Popular Tags
 
Archives
2013
 
Recent Comments
Nice PicsI saw all 25 Dragon Fly Picsin Flicr
- S I M H A on Dragon Fly
Hey Will thanks for the information. I will keep that in mind.
Hey Will thanks for the information. I will keep that in mind.
Hey Will thanks for the information. I will keep that in mind.
Sir, can you give me full code.?
 
© 2008 Will McGugan.

A technoblog blog, design by Will McGugan