talideon.com

Coding for fun and prophet!

January 3, 2009 at 10:19AM Minesweeper Kata

I did the minesweeper kata a few days back. It was a bit too easy, though it’s quite possible that this is more an artifact of how I approached it. Given that it’s a simple one, it might be an idea to see if I can take it in smaller steps, much smaller steps.

There’s not really much to say, so here’s the code:

#!/usr/bin/env python

import doctest
import sys

def unpack_line(s):
    """
    >>> unpack_line('.....')
    [0, 0, 0, 0, 0]
    >>> unpack_line('')
    []
    >>> unpack_line('.*.')
    [0, '*', 0]
    """
    line = []
    for c in s:
        if c == '*':
            line.append('*')
        else:
            line.append(0)
    return line

def mark_adjacents(lines):
    """
    >>> mark_adjacents([[0, 0], [0, 0]])
    [[0, 0], [0, 0]]
    >>> mark_adjacents([['*', 0], [0, 0]])
    [['*', 1], [1, 1]]
    >>> mark_adjacents([['*', 0], [0, '*']])
    [['*', 2], [2, '*']]
    >>> mark_adjacents([[0, 0, 0], [0, '*', 0], [0, 0, 0]])
    [[1, 1, 1], [1, '*', 1], [1, 1, 1]]
    >>> mark_adjacents([['*', 0, 0], [0, 0, 0], [0, 0, '*']])
    [['*', 1, 0], [1, 2, 1], [0, 1, '*']]
    """
    for y in range(len(lines)):
        for x in range(len(lines[y])):
            if lines[y][x] == '*':
                for oy in [y - 1, y, y + 1]:
                    if oy < 0 or oy == len(lines):
                        continue
                    for ox in [x - 1, x, x + 1]:
                        if ox < 0 or ox == len(lines[y]):
                            continue
                        if lines[oy][ox] != '*':
                            lines[oy][ox] += 1
    return lines

def process_fields(iterable):
    start = True
    for line in iterable:
        line = line.rstrip()
        if start:
            start = False
            lines = []
            height, width = [int(x) for x in line.split(' ')]
            if width == 0 and height == 0:
                break
        else:
            lines.append(unpack_line(line))
            height -= 1
            if height == 0:
                start = True
                yield mark_adjacents(lines)

def main():
    field = 0
    for f in process_fields(sys.stdin):
        field += 1
        if field > 1:
            print
        print "Field #%d:" % field
        for l in f:
            print ''.join([str(x) for x in l])

def _test():
    doctest.testmod()

if __name__ == '__main__':
    main()

I should really buckle down and finish the code that’s meant to replace the crappy one-evening hack the this blog runs on.

December 31, 2008 at 12:44AM Harry Potter Kata

I did the Harry Potter Kata earlier because I was feeling a little bored. Here’s the result. I use cents rather than euros as my unit of currency here.

I’d a start when I was thinking about how I’d solve the problem. I thought that being greedy and trying to make the biggest sets would do the trick, but the first example test case given soon managed to disabuse me of that notion.

Then I got the idea of turning things around 90 degrees by creating a set of buckets within the basket and distributing the sets between them. However, given the sample test case, I realised that the maximum discount could come from an obscure distribution of books, so I took the brute force route and decided to enumerate all the possible ways to distribute all the books of a certain volume, tracking which combination gave the greatest discount, and did this iteratively for each volume.

I already had the intuition to eliminate the larger groups of volumes sooner than the smaller ones, and I knew that the important thing was the number of books of each volume there were rather than what the particular volumes were, so sorting basket by volume made sense and avoided the need to do any backtracking later.

I looked at other implementations after I’d finished, and mine seems to be sorter and more general--though not purposely so--than most, but I’m still pretty sure there’s a much more elegant way to find an ideal distribution of volumes than the brute force method I used, though I’ve yet to think of one. Here’s the code:

#!/usr/bin/env python

import doctest
import itertools

def sum_basket(basket, unit_price, discount_pcs):
    """
    Gets the sum price of the books in the basket.

    basket - list of number of each book
    unit_price - price of a book
    discount_pcs - percentage discount to apply for each additional book in a set

    >>> price = 800
    >>> discount_pcs = [0, 0, 5, 10, 20, 25]
    >>> sum_basket([0, 0, 0, 0, 0], price, discount_pcs)
    0
    >>> sum_basket([0, 1, 0, 0, 0], price, discount_pcs)
    800
    >>> sum_basket([0, 3, 0, 0, 0], price, discount_pcs)
    2400
    >>> sum_basket([1, 1, 0, 0, 0], price, discount_pcs)
    1520
    >>> sum_basket([2, 1, 2, 1, 0], price, discount_pcs)
    4080
    >>> sum_basket([2, 1, 1, 1, 1], price, discount_pcs)
    3800
    >>> sum_basket([2, 2, 2, 1, 1], price, discount_pcs)
    5120
    >>> sum_basket([1, 1, 2, 2, 2], price, discount_pcs)
    5120
    >>> sum_basket([5, 5, 4, 5, 4], price, discount_pcs)
    14120
    """
    buckets = get_basket_buckets(basket)
    result = 0

    for books in sorted(basket, reverse=True):
        best = None
        for patterns in bitmap_combinations(books, len(buckets)):
            attempt = [x + y for x, y in itertools.izip(buckets, patterns)]
            attempt_sum = 0
            for bucket in attempt:
                attempt_sum += sum_bucket(bucket, unit_price, discount_pcs)
            if best is None or attempt_sum < result:
                best = attempt
                result = attempt_sum
        buckets = best

    return result

def sum_bucket(bucket, unit_price, discount_pcs):
    """
    Gets the sum cost of a set of books.

    bucket - number of unique books in a set.
    unit_price - price of a book
    discount_pcs - percentage discount to apply for each additional book in a set

    >>> price = 800
    >>> discount_pcs = [0, 0, 5, 10, 20, 25]
    >>> sum_bucket(0, price, discount_pcs)
    0
    >>> sum_bucket(1, price, discount_pcs)
    800
    >>> sum_bucket(2, price, discount_pcs)
    1520
    >>> sum_bucket(3, price, discount_pcs)
    2160
    >>> sum_bucket(4, price, discount_pcs)
    2560
    >>> sum_bucket(5, price, discount_pcs)
    3000
    """
    return bucket * unit_price * (100 - discount_pcs[bucket]) / 100

def bitmap_combinations(ones, length):
    if length == 0:
        yield []
    else:
        if ones < length:
            for tail in bitmap_combinations(ones, length - 1):
                yield itertools.chain([0], tail)
        if ones > 0:
            for tail in bitmap_combinations(ones - 1, length - 1):
                yield itertools.chain([1], tail)

def get_basket_buckets(basket):
    """
    Creates enough basket buckets to hold the maximum number of unique book sets in
    the basket.

    >>> get_basket_buckets([0])
    []
    >>> get_basket_buckets([0, 0])
    []
    >>> get_basket_buckets([1, 0])
    [0]
    >>> get_basket_buckets([0, 1])
    [0]
    >>> get_basket_buckets([0, 2])
    [0, 0]
    >>> get_basket_buckets([1, 2])
    [0, 0]
    >>> get_basket_buckets([1, 2, 3, 5, 2])
    [0, 0, 0, 0, 0]
    """
    return [0 for i in range(max(basket))]

def _test():
    doctest.testmod()

if __name__ == '__main__':
    _test()

December 18, 2008 at 8:12PM Testing embedding stuff from Github Gist: pipedarg

I knocked this out earlier because I needed something to allow me to open HTML document attachments in Firefox from mutt. It writes standard input to a temporary file and then executes the remainder of the arguments passed to the script, with the path of the temporary file given as the final argument. Might be useful to somebody.

It was also a good opportunity to test out embedding a Github Gist. If you can’t see it, here, you can see the source for pipedarg here.

Update (27th December): I wasn’t thinking straight when I wrote that, and I’d corrected the script a couple of days after posting it up. You can see what changes I made from the history.

December 5, 2008 at 5:25PM PHP SimpleXML bug: prefix required for attributes in namespace other than the default namespace

I’ve just submitted PHP Bug #46769. This bit me at work because it caused the tests for the EPP client code to start failing suddenly on my machine.

The story is that the fix for Bug #43221 made prefixes on attribute qnames required if the namespace of that attribute was anything other than the default namespace.

Previously to this ‘fix’, libxml2 was let do its job and was able to figure out what the prefix was from the namespace. The ‘fix’ prevents libxml2’s ability to do this.

Here’s the example I posted up demonstrating the bug:

<?php
$ns_foo = "tag:example.com,2008:foo";
$ns_xsi = "http://www.w3.org/2001/XMLSchema-instance";

$root_doc = <<<EOT
<?xml version="1.0" encoding="UTF-8"?>
<a xmlns:xsi="$ns_xsi"
   xmlns="tag:example.com,2008"
   xsi:schemaLocation="tag:example.com,2008 root.xsd"
   xmlns:foo="$ns_foo"/>
EOT;

$root = simplexml_load_string($root_doc);
$child = $root->addChild('bar', null, $ns_foo);
$child->addAttribute('schemaLocation', "$ns_foo foo.xsd", $ns_xsi);
print_r($root->asXml());

What should be happening and what was happening previously was that because the XML schema instance namespace was declared on a parent element of <foo:bar/>, the correct document was being generated:

<?xml version="1.0" encoding="UTF-8"?>
<a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="tag:example.com,2008" xmlns:foo="tag:example.com,2008:foo"
   xsi:schemaLocation="tag:example.com,2008 root.xsd">
  <foo:bar xsi:schemaLocation="tag:example.com,2008:foo foo.xsd"/>
</a>

However now this code triggers the warning “Warning: SimpleXMLElement::addAttribute(): Attribute requires prefix for namespace” and the attribute is no longer being added to <foo:bar/>.

This broke the EPP client badly because the code for building requests declares the namespaces on the root element of the document and relies on being able to reference them later by name. The upshot of this is that our EPP client (and, no doubt, other code) cannot be deployed on PHP 5.2.6 and later.

We’ll see how this is followed up. Suffice it to say that this is not the best start to the weekend, and that once again unit testing has displayed its awesomeness.

December 3, 2008 at 5:36PM Custom error handlers + the @-operator in PHP = FAIL

This is definitely up there as one of the most retarded PHP misfeatures I’ve encountered.

One of our systems in work needs to check to see if the host given in an email address can receive mail. For that, we’ve a nice short function that does exactly what RFC5821, and all the other SMTP RFCs say to do to check for this:

function can_host_receive_mail($host) {
    if (checkdnsrr($host, 'MX')) {
        return true;
    }
    if (checkdnsrr($host, 'A') || checkdnsrr($host, 'AAAA')) {
        $h = @fsockopen($host, 25, $_errno, $_errstr, 30);
        if ($h) {
            fclose($h);
            return true;
        }
    }
    return false;
}

Simple, right? Wrong. The key line here is the call to fsockopen(), which uses the @ operator to suppress the warning if the host can’t be contacted on port 25. Only the code is behaving as if the @ operator isn’t there at all.

You see, the particular piece of software where this is causing a problem has a custom error handler that intercepts PHP errors and converts them to exceptions to allow uniform error handling. However, it appears that if you’ve a custom error handler, the @ operator is rendered useless.

This is a big problem, especially with fsockopen() because not being able to connect to a host on a given port is, while a little exceptional, not an error; it’s one of these cases where you want an error code rather than an exception because coping with failure in this case is part of the normal flow of the program. That is why PHP triggers an E_WARNING rather than an E_ERROR, and why I was using the @ operator up above.

This breaks if you’re using a custom error handler, and @ stops suppressing the warning in spite of the fact that the presence of @ before an expression should logically by virtue of its locality override any PHP error (as opposed to exception) handling.

To work around this, I altered it to catch the exception the error handler throws:

function can_host_receive_mail($host) {
    if (checkdnsrr($host, 'MX')) {
        return true;
    }
    if (checkdnsrr($host, 'A') || checkdnsrr($host, 'AAAA')) {
        try {
            $h = fsockopen($host, 25, $_errno, $_errstr, 30);
        } catch (AFK_TrappedErrorException $ex) {
            return false;
        }
        if ($h) {
            fclose($h);
            return true;
        }
    }
    return false;
}

Damn, that sucks.

A custom error handler should not render the @ operator useless; it should allow suppression of the custom error handler too. The @ operator is there for a reason.

November 20, 2008 at 4:38PM Things I hate: being asked for my title, given name, and surname in forms

It’s insane and culturally ignorant to do this. If you look at the way people’s names work you’ll notice that only a small, small number of people actually have names that fit into that format.

Take Irish names which include ‘O’ (such as “O’Hara” or “O Hara”, depending on preference) or ‘Mc’ (such as “McAllister”): I’ve seen names rejected because of the presence of the apostrophe, the ‘O’ treated like a middle initial, ‘Mc’ and ‘Mac’ treated like middle names, and the likes of “McAllister” normalised ignorantly to ‘Mcallister’. Or take the Irish form of the names ‘McGillacuddy’ and ‘McAleese’, which are ‘Mac Giolla Chuda’ and ‘Mac Giolla Iosa’: I’ll leave how they get mangled up to your imagination.

And then there’s double-barrel names. Spaniards must cringe at the way their names get mangled.

Hungarians too, seeing as they put their surname first and their given name after. Ditto for the Chinese, Japanese, and many other cultures.

Indonesians only have the one name. Yup, one name. Suck on that!

There’s only two sane things to do:

  1. Ask for their name, their whole name, and only their name,
  2. But if you really, really need to know how they prefer to be addressed, ask them that too, but make it optional.

Let’s take a simple example using my own name. I prefer to give my name as ‘Keith Gaughan’, and I prefer being addressed as ‘Keith’, not ‘Mr. Gaughan’ or any of that crap, but if I was 50 years of age, I might start to prefer that, or ‘Dr. Gaughan’, if I get a PhD in the future. Similarly, if I was to use my mother’s maiden name in my full name, it’d be ‘Keith Gaughan-McAllister’, but I might want to be referred to as ‘Mr. Gaughan’.

If you stick to either of these two schemes, you’ll be able asking people in a way that will work anywhere and won’t appear quite so ignorant or quasi-eurocentric.

Aside: I’m not giving any decent examples here for a very stupid reason: the idiot database driver on the server that hosts my site kills high-bit characters and doesn’t like UTF-8. Suckage.

Update: I remember finding this article ages back on the subject and agreeing vehemently. Take a read, it’s a good one.

October 22, 2008 at 8:50PM Mayhem in Monsterland on Wii Virtual Console = FAIL

It’s almost painful to watch the second part of this after watching the comparison with the original. This is one of my favourite games, and last time I played it under VICE, it worked perfectly, so how did Nintendo manage to screw things up so badly? Do Nintendo have some kind of irrational GPL allergy or something?