Doing less work to compute excellent numbers

When I last looked at excellent numbers, I knew there was more work I could do to optimize what I was doing. In particular, I knew there was some upper limit to the range of numbers I had to check. I didn’t take the time to think about until today. I had a brief foray into other languages, such as my attempt with Julia, as I ran out of time to think about it.

To review, a number is excellent if you can chop its digits in half and the difference between the squares of the back and front halves are the number again. For instance, 48 would be 4 and 8, then 8² – 4² = 64 – 16 = 48.

It’s easy to brute force this but it’s not quick once you get up to 10-digit numbers. I want to work on 22-digit numbers. By that point, I need Perl’s bigint, which slows things down considerably. I had noted earlier that I was getting performance much better than Mark-Jason’s estimate of 1,000 candidates per second, but now his estimate is spot on.

I knew I was checking some numbers which I should already know are not going to be excellent. Consider some number with two halves a and b. I’m going to have to compute b² – a² to get the number back. If I take the maximum possible b² (all 9s), can I figure out the maximum a that will return a number of the right length for that subtraction?

I also know that b² – b is the same as a*10^k + a², where k in the order of magnitude of half the number (e.g. for 1,000, k is 3). It’s a way to shift over the digits in a to their right places in the big number. Now I need to figure out a for the maximum b. I decided to guess until I found it. I was a bit sloppy, but extra time being not sloppy delays the answer (sometimes optimizing means being ugly):

use v5.22;
use feature qw(signatures);
no warnings qw(experimental::signatures);

sub bisect ( $k, $sub, $threshold=1 ) {
	my $b = '9' x $k;  # the largest b

	# the limits for our bisection. These will close in on the middle
	# as we test the bounds
	my $maximum = $b;
	my $minimum = '1' . ( '0' x ($k-1) );

	# Let's start in the middle of the range
	my $try = int( ( $maximum - $minimum ) / 2 );

	my %Seen;
	while( 1 ) {
		my $result = $sub->( $try, $k );
		#say "Min: $minimum: Max: $maximum Try: $try Result: $result";

		my $last_try = $try;
		$try = do {
			if( abs($result - $b) < $threshold ) {
				#say "last block";
				last ;
				}
			elsif( $result > $b )    { # result is too big, make try smaller
				#say "too big";
				$maximum = $try;
				int( $minimum + ( $try - $minimum ) / 2 );
				}
			elsif( $result < $b ) { # result is too small, make try bigger
				#say "too small";
				$minimum = $try;
				int( ($try + ( $maximum - $try ) / 2) + 1 );
				}
			else { return $try }
			};
		last if int( $last_try ) == int( $try );
		last if $Seen{ $try }++; # stop cycles

		#say "next try $try";
		}

	# cheat a little
	return int( $try + $threshold );
	};

Notice that I create some numbers in that subroutine using string operations.

Now I can get pretty close to a without solving for the polynomial, which seemed too complicated for 2am (but which I'm really solving by bisection). And, check out the use of a subroutine signature for the anonymous callback:

my $max_a = bisect(
	$k+1,
	sub ( $a, $k ) { sqrt( $a * 10**$k + $a**2 ) },
	my $threshold = 1
	);

I can construct a table of maximum candidate values. This isn't the maximum excellent number (for k = 1, the greatest excellent number is 48 for instance). I knew that the limit was somewhere close to the beginning of the numbers that started with 6, but now I know:

 1 -> 63
 2 -> 619
 3 -> 6181
 4 -> 61805
 5 -> 618034
 6 -> 6180340
 7 -> 61803400
 8 -> 618033989
 9 -> 6180339888
10 -> 61803398875
11 -> 618033988751
12 -> 6180339887499
13 -> 61803398874990
14 -> 618033988749895
15 -> 6180339887498949
16 -> 61803398874989485
17 -> 618033988749894849
18 -> 6180339887498948483
19 -> 61803398874989484821
20 -> 618033988749894848205
21 -> 6180339887498948482046
22 -> 61803398874989484820459
23 -> 618033988749894848204588
24 -> 6180339887498948482045869
25 -> 61803398874989484820458684

If I'm trying the values of a from 1,000 to 9,999, I can stop at 6,181. Out of 9,000 candidates, I only have to check about 5,000 of them. I'm eliminated about 45% of the candidates. I didn't make the actual computations any faster; I merely don't check hopeless cases. I get the excellent numbers in a particular range as slowly as I did before, but my program terminates sooner (or moves on to the next range sooner).

Now that I've gone through this process it seems obvious but I'm almost embarrassed to say to I had to puzzle it out on paper in an evening. But, an evening scribbling on paper often saves quite a bit of run time.