Ten numbers on a blackboard

In Ten numbers on a blackboard, someone asks about the largest number you can compute by reducing a set of numbers. I was surprised to see that someone spent quite a bit of time to brute force it and that their Python solution was so slow. So many of the things I write about in the “Benchmarking” and “Profiling” chapters come into play in this conversation.

The particular problem is to reduce a list of numbers by taking them off of the list and replacing them with the single value a*b + a + b. Expressed in Perl, I write:

my @array = 1 .. 10;

my $sum = 0;
foreach ( @array ) {
	$sum = $sum * $_ + $sum + $_;
	}
	
print "Sum is $sum\n";

The answer for that sequence is 39,916,799.

There’s a bit of a trick here by starting with $sum at 0. The first loop gets me back to the first item in the list. It’s a nice trick in programming to include a degenerate case and even nicer that Perl allows and even expects me to start with an undefined value. I could just as well start with the first element directly by shifting it into $sum:

my @array = 1 .. 10;

my $sum = shift @array;
foreach ( @array ) {
	$sum = $sum * $_ + $sum + $_;
	}
	
print "Sum is $sum\n";

Still, the question remains. Would I get the same answer no matter the order of the numbers in the sequence? I can generate all of the permutations of the list, something explained in the perlfaq4’s How do I permute N elements of a list? as well as StackOverflow’s How can I generate all permutations of an array in Perl?.

For this article, I’ll use Algorithm::Permute. It’s a bit of an odd interface because its permute function takes a code block where I get the permutation in an array variable of the same name:

use Algorithm::Permute ('permute');

my @array = 1 .. 10;

permute {
    print "@array\n";
	} @array;

Careful here! There are 10! permutations of the order of 10 distinct thingys. That’s 3,628,8000 lines of output. Let’s start with something a bit smaller. I’ll start with two items:

use Algorithm::Permute ('permute');

@ARGV = 1 unless @ARGV;

my @array = 1 .. $ARGV[0];

permute {
	print "S = [@array] ";
	my $sum = $array[0];
	foreach ( 1 .. $#array ) {
		$sum = $sum * $array[$_] + $sum + $array[$_];
		}

	print "sum is $sum\n";	
	} @array;

Here are a few runs where I see the sum is the same each time (more on that in a moment):

% permute 2
S = [1 2] sum is 5
S = [2 1] sum is 5
% permute 3
S = [1 2 3] sum is 23
S = [1 3 2] sum is 23
S = [3 1 2] sum is 23
S = [2 1 3] sum is 23
S = [2 3 1] sum is 23
S = [3 2 1] sum is 23
% permute 4
S = [1 2 3 4] sum is 119
S = [1 2 4 3] sum is 119
S = [1 4 2 3] sum is 119
S = [4 1 2 3] sum is 119
S = [1 3 2 4] sum is 119
S = [1 3 4 2] sum is 119
S = [1 4 3 2] sum is 119
S = [4 1 3 2] sum is 119
S = [3 1 2 4] sum is 119
S = [3 1 4 2] sum is 119
S = [3 4 1 2] sum is 119
S = [4 3 1 2] sum is 119
S = [2 1 3 4] sum is 119
S = [2 1 4 3] sum is 119
S = [2 4 1 3] sum is 119
S = [4 2 1 3] sum is 119
S = [2 3 1 4] sum is 119
S = [2 3 4 1] sum is 119
S = [2 4 3 1] sum is 119
S = [4 2 3 1] sum is 119
S = [3 2 1 4] sum is 119
S = [3 2 4 1] sum is 119
S = [3 4 2 1] sum is 119
S = [4 3 2 1] sum is 119

Rather than output every sum, I modify the program to find a sum that’s not the same. If I can find one that’s different I know that I might have to brute-force the problem to find the maximum:

use v5.10;

use Algorithm::Permute ('permute');

@ARGV = 1 unless @ARGV;

my @array = 1 .. $ARGV[0];

permute {
	my $sum = $array[0];
	foreach ( 1 .. $#array ) {
		$sum = $sum * $array[$_] + $sum + $array[$_];
		}

	state $last_sum = $sum;

	say "Sum for [@array] is special: $sum" 
		unless $last_sum eq $sum;
	} @array;

I like this trick with state where I initialize it with the first value of $sum, but with this code I don’t see any output. Instead, I can use a hash to count the sums that I compute. If I accumulate more than one sum, I do more work to find the minimum and maximum values:

use v5.10;
use Algorithm::Permute ('permute');
use List::Util qw(min max);

@ARGV = 1 unless @ARGV;

my @array = 1 .. $ARGV[0];

my %Sums;

permute {
	my $sum = $array[0];
	foreach ( 1 .. $#array ) {
		$sum = $sum * $array[$_] + $sum + $array[$_];
		}

	$Sums{$sum}++;
	} @array;

if( 1 < values %Sums ) {
	print "There is more than one sum for n=$ARGV[0]!";
	my $min = min( keys %Sums );
	my $max = max( keys %Sums );
	
	say "Min: $min Max: $max";
	}
else {
	say "There is only one sum for n=$ARGV[0]" . (keys %Sums)[0];
	}

Now for the brute-force. This is the part that caught my eye about imallett's Python solution, which he says:

For what it's worth, after reading the answers I decided to see how long a brute force program would take. My solution takes 0.43 seconds on the analogous problem of length 7, 12.3 seconds for length 8, and 442.1 seconds for length 9. I didn't let it run for length 10.

There are so many odd things about that statement? Twelve seconds? Doesn't that seem odd for 7! (=5040) operations? Remember in Mastering Perl that I strongly emphasized that we can only compare times on the same setups. But, even without trying very hard, my solution is three orders of magnitude faster:

% time perl permute 7
There is only one sum: 40319

real    0m0.033s
user    0m0.027s
sys     0m0.005s

One of imallett's problems is his use of recursion, a popular technique that's great for high-level languages that can compile it into something that isn't recursive. This isn't a problem that needs it. Mark Jason Dominus talks about this quite a bit in Higher-Order Perl too.

Here's imallett Python solution, which I find hard to unravel:

#!/usr/bin/python

import time

def rem_comb(i,j, l):
	l2 = []
	for index in range(len(l)):
		if index == i or index == j: continue
		l2.append(l[index])
	a,b = l[i],l[j]
	l2.append(a*b + a + b)
	return l2

def experiment(l):
	if len(l) == 1:
		return l[0]
	else:
		best = (0, None,None)
		for i in range(len(l)):
			for j in range(i+1,len(l),1):
				value = experiment(rem_comb(i,j, l))
				if value > best[0]:
					best = (value, i,j)
		return best[0]

for length in range(1,10+1,1):
	t0 = time.time()
	value = experiment(list(range(1,length+1,1)))
	t1 = time.time()
	print("Length % 2d best %d (time %f)"%(length,value,t1-t0))

For N=7, it takes less than a second, but then it blows up at N=8 on my machine. These are roughly the same times he reports, but this is awful performance:

Length  1 best 1 (time 0.000006)
Length  2 best 5 (time 0.000015)
Length  3 best 23 (time 0.000043)
Length  4 best 119 (time 0.000223)
Length  5 best 719 (time 0.002201)
Length  6 best 5039 (time 0.032876)
Length  7 best 40319 (time 0.674944)
Length  8 best 362879 (time 18.632982)
Length  9 best 3628799 (time 683.095185)

I also strongly emphasized that in these sorts of issues, it's the algorithm that's usually the problem, not the language or the interpreter. I haven't written Python in a long time (and now I remember why), but this solution is much faster:

#!/usr/bin/python

import time
from itertools import permutations

def add_it_up(p):
	sum = 0;
	for i in p:
		sum = sum * i + sum + i
	return sum

def experiment(l):
	sum = 1;

	last_sum = 0	
	for p in permutations( range( 1, l+1 ) ):
		sum = add_it_up( p )
		if last_sum > 0 and sum != last_sum:
			print( p )
			print( "For %d, found two sums: %d %d" % ( l, sum, last_sum ) )
		last_sum = sum
	return last_sum    	

for length in range(1, 10+1, 1):
	t0 = time.time()
	value = experiment( length )
	t1 = time.time()
	print( "Length % 2d best %s (time %f)" % (length,value,t1-t0) )

This leads to the advice I gave in Mastering Perl. Better algorithms are better than optimization. The accepted answer for this puzzle shows that for 1 .. N, the sum is N! - 1. More importantly, it shows that since all of the operations are commutative, the order doesn't matter so I don't have to compute every sum.

Now I divert from the question, which already has its answer. Now I merely want to make the sums to see how fast I can do it. I already show in the "Profiling" chapter how to compute factorials while saving the result from previous computations (you can see all of the book's programs on the Downloads page:

#!/usr/bin/perl
# factorial_iterate_bignum_memo.pl

use bignum;

{
my @Memo      = (1);

sub factorial {
	my $number = shift;

	return unless int( $number ) == $number;
	return $Memo[$number] if $Memo[$number];

	foreach ( @Memo .. $number ) {
		$Memo[$_] = $Memo[$_ - 1] * $_;
		}

	$Memo[ $number ];
	}
}

{
print "Enter a number> ";
chomp( my $number = <STDIN> );
exit unless defined $number;

print factorial( $number ), "\n";
redo;
}

As I continued to brute-force the problem, I would have to redo quite a bit of work to compute factorials I already know the answer to, but there's no reason to do all that extra work. I modify the factorial program a little (and add some v5.10 features):

use v5.10;

sub factorial {
	my $number = shift;
	state $Memo = [1];

	return unless int( $number ) == $number;
	return $Memo->[$number] if $Memo->[$number];

	foreach ( @$Memo .. $number ) {
		$Memo->[$_] = $Memo->[$_ - 1] * $_;
		}

	$Memo->[ $number ];
	}

foreach ( 1 .. $ARGV[0] ) {
	say "N = $_, sum is ", factorial( $_ ) - 1;
	}

Now I know the sums virtually instantly:

% time perl factorial 100
N = 1, sum is 0
N = 2, sum is 1
N = 3, sum is 5
N = 4, sum is 23
N = 5, sum is 119
N = 6, sum is 719
N = 7, sum is 5039
N = 8, sum is 40319
N = 9, sum is 362879
N = 10, sum is 3628799
...
N = 99, sum is 9.33262154439441e+155
N = 100, sum is 9.33262154439441e+157

real    0m0.007s
user    0m0.003s
sys     0m0.003s

Suppose now, that I did have to check every sum because the operations weren't commutative and the order mattered. How would I make everything faster? I'd do the same thing I did with the factorial example. I would save the result of previous operations, even across different-sized inputs. First, I'll comment out the meat (at line 15) and count the number of calls for each input length:

use v5.10;
use Algorithm::Permute;

my $Memo;
my %Calls;

sub make_key { join "\000", @_ }

sub in_order {
	my $key = make_key( @_ );

	$Calls{ scalar @_ }++;
	
	if( exists $Memo->{$key} ) {
		#return $Memo->{$key};
		}

	if( @_ == 2 ) {
		return $Memo->{$key} = $_[0]*$_[1] + $_[0] + $_[1];
		}
	elsif( @_ == 1 ) {
		return $_[0]
		}		
	
	$Memo->{$key} = in_order(
		in_order( @_[0 .. $#_-2] ),
		in_order( @_[-2, -1] )
		);
	}
	
N: foreach my $n ( 2 .. $ARGV[0] ) {
	my $p = Algorithm::Permute->new( [ 1 .. $n ], $n );
	
	my $last_sum;
	PERMUTE: while( my @res = $p->next ) {
		my $key = make_key( @_ );
		$Memo{ $key } = in_order( @res );
		if( defined $last_sum and $Memo{ $key } != $last_sum ) {
			say "For (@res), sum is different [$last_sum != $Memo{ $key }]";
			next N;
			}
		$last_sum = $Memo{ $key };
		}
		
	say "N = $n: sum is $last_sum";
	}

say Dumper( \%Calls ); use Data::Dumper;

This is still pretty slow, but much faster than the recursive solution in Python:

% time perl memo 10
N = 2: sum is 5
N = 3: sum is 23
N = 4: sum is 119
N = 5: sum is 719
N = 6: sum is 5039
N = 7: sum is 40319
N = 8: sum is 362879
N = 9: sum is 3628799
N = 10: sum is 39916799
$VAR1 = {
          '3' => 368046,
          '1' => 368046,
          '6' => 3669840,
          '5' => 368040,
          '10' => 3628800,
          '8' => 3669120,
          '9' => 362880,
          '4' => 3669864,
          '2' => 35878886,
          '7' => 367920
        };


real    2m53.379s
user    2m51.615s
sys 0m1.697s

Now, I'll uncomment that line and try again. I cut the time in half, roughly, and see there are fewer calls. I strongly emphasize this in the Mastering Perl too. If I think I know the situation, I should make a change to ensure what I understand to happen actually does:

% time perl memo 10
N = 2: sum is 5
N = 3: sum is 23
N = 4: sum is 119
N = 5: sum is 719
N = 6: sum is 5039
N = 7: sum is 40319
N = 8: sum is 362879
N = 9: sum is 3628799
N = 10: sum is 39916799
$VAR1 = {
          '8' => 3669120,
          '2' => 12323810,
          '6' => 1815120,
          '5' => 181560,
          '1' => 504,
          '9' => 362880,
          '3' => 15126,
          '7' => 367920,
          '10' => 3628800,
          '4' => 151224
        };


real    1m27.931s
user    1m26.666s
sys 0m1.251s