Programming challenges

Jplus · Post by **Jplus** » Fri Nov 25, 2022 12:34 pm

On the old forums, there was Programming Wars (unfortunately I could not get hold of it on archive.org). If there is enough interest, I would enjoy doing something similar here, though not necessarily as competitive.

Suggestion for rules:

The game should go in non-overlapping rounds, each hosted by at least one participant.
The host defines a task that can be solved by writing a (short) computer program.
The host provides clear criteria for awards such as passing, winning, and/or earning badges or (bonus) points.
Other rules are also up to the host, for example whether multiple solutions or teams are allowed (or required).
The host sets a deadline, preferably a relaxed one.
Ideally, participants are free to use any programming language of choice.
The other participants send the source code of their solutions to the host by PM, along with any information required in order to run the code.
After the deadline, the host publishes all the source code submissions along with awards and (optional) comments.
All people who submitted a solution for the current round should get an opportunity to respond after the results have been published and before the next round is started.
The next host can be chosen based on consensus; if no consensus is reached, the current host decides.

Let us start when at least three people have expressed interest (me included).
I will try to update this opening post with links to the opening and closing posts of each round.
I am willing to be the first host, but also more than happy if somebody else wants to kick off.

Index of rounds

Sorting algorithms: task, results
Rock, paper, scissors: task, results
Maze generator and/or solver: task

phillip1882 · Post by **phillip1882** » Fri Nov 25, 2022 2:32 pm

i'd love this

somitomi · Post by **somitomi** » Fri Nov 25, 2022 7:19 pm

that sounds fun

commodorejohn · Post by **commodorejohn** » Fri Nov 25, 2022 7:58 pm

Definitely intrigued.

Jplus · Post by **Jplus** » Sat Nov 26, 2022 1:41 am

So we have enough interest. Cool! Nobody expressed a wish to host yet, so I will take the first turn.

I have always loved sorting algorithm animations, for example at https://www.toptal.com/developers/sorting-algorithms. I even wrote a sorting algorithm library at one point, which I never published. Hence the first challenge: implement a sorting algorithm!

Task and rules

You will write a sort function. At the very least, this function should be able to sort a sequence of integers in ascending order. You can use the following example sequences to help yourself assess whether you are on the right track, although I will be testing your submissions with other sequences as well. Input sequence on the left, expected output sequence on the right:

Code: Select all

[] -> []
[1] -> [1]
[1, 2, 3] -> [1, 2, 3]
[3, 2, 1] -> [1, 2, 3]
[8, 4, 7, 3, 9, 2, 7, 5] -> [2, 3, 4, 5, 7, 7, 8, 9]
[2, 1, 1, 1, 2, 1, 2, 2, 1, 2] -> [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]

Below is an example of a Python function that meets this minimal requirement:

Code: Select all

# This is an implementation of gnome sort. You can think of it as a gnome
# comparing two adjacent elements at a time, optionally swapping the elements
# and moving one position left or right at a time.
def sort(data):
    index = 0
    end = len(data) - 1
    while index < end:
        left = data[index]
        right = data[index + 1]
        if right < left:
            # not sorted, swap the elements and move back
            data[index:(index + 2)] = [right, left]
            index = max(0, index - 1)
        else:
            # already sorted, move forward
            index += 1

Of course, a solution may include other functions as well, but the function with the name sort is understood to be the entry point.

You may accept the input data in a form that is conventional for your programming language, for example a list in Haskell, a table in Lua or two separate iterator arguments in C++. It is also up to you whether you modify the data in place, like in the example above, or leave the input data intact and produce a sorted copy as output.

Not everyone may be familiar with sorting algorithms. It is allowed to study existing sorting algorithms before you start. You may implement an existing algorithm, including the example above; you do not need to invent a new one. However, you are not allowed to look at example code, pseudocode, visualizations or other explanatory material on sorting algorithms while programming. In the end, you have to get your code to work on your own.

You are allowed to submit multiple solutions. It is also allowed to cooperate with other participants. You can work alone or in a team on a case-by-case basis, so for example one solution alone, one with partner A and one with partner B.

Awards

Solutions, rather than people, are awarded emoji badges based on particular qualities. Of course, submitters will be credited for their submissions. Other than that, nobody will earn points and no definitive winner will be chosen.

There is a broad range of possible badges, in order to cater to different strengths. Note that some badges are negative.

Spoiler (Show/Hide)

A solution that meets the minimal requirement is considered to pass, which earns it a

. A solution can receive up to five additional

for additional features:

Supporting arbitrary data types, for example also strings.
Supporting descending order as an option.
Supporting not just descending order but arbitrary special sorting criteria, for example case-insensitive for strings (note that there are ways to support custom criteria generically).
Being stable, which means that equivalent elements of the input will keep the same relative order.
Multithread support.

A solution can earn up to six

for ethical and esthetical code qualities:

Being completely DRY ("Don't Repeat Yourself").
Modularity, promoting generic code reuse.
Being very clear in name choice and markup.
Useful comments (documenting the "why" rather than the "what").
Striking me as very creative or original (admittedly subjective).
Being very amusing (again, subjective).

In each of the following categories, a

is awarded to the solution with the best asymptotic complexity. In the first two categories, a

is also awarded to the solution with the best complexity for input data with exactly eight elements. A

is awarded if a solution includes costly operations that do not fit in one of the below categories.

Total number of order-determining operations, such as comparisons or radix computations.
Total number of copies/moves. Swapping two elements always counts as three copies, even if your programming language has a notation that hides this or if you use arithmetic tricks that technically avoid copying. For fairer comparison with Haskell and similarly immutable programming languages, solutions that modify the input are pretended to work on a copy.
Amount of additional memory required. Again, for fairer comparison with immutable programming languages, solutions that modify the input are pretended to copy it first.
Cache misses: how many times a theoretical CPU with a L1 cache that can fit 10k elements would need to load a different part of the input from RAM or L2+ cache.
Thread synchronization operations, such as mutex locks. This category is only awarded if there are at least two multi-threaded solutions.

A solution can earn a

for each type of input for which it performs particularly well and a

for each type of input that can seriously degrade its performance (or even break it).

If I find enough time, I will port all solutions to C++ and measure their actual speed. Solutions that modify the input will be wrapped to operate on a copy and lazy datastructures will be ported using Okasaki (explanation here), unless I see a (relatively) straightforward way to do it using just iterators. In each input size category from 10 to 10M, by steps of a factor 10, a

goes to the solution with the least total user CPU time. If any solution is multithreaded, there is also a

in the largest four categories for the least elapsed wall clock time.

If a solution has a positive quality I did not think of beforehand, but which I think deserves a reward, I may give it a single

.

Deadline

Sunday, January 1 2023 at 23:00 GMT. I will try to publish the results within two weeks after the deadline.

Questions

Please feel free to ask for clarifications. Discussion is also welcome.

phillip1882 · Post by **phillip1882** » Mon Nov 28, 2022 9:35 pm

i'd like to host the next one. i have a couple contest ideas.

Jplus · Post by **Jplus** » Wed Nov 30, 2022 2:03 am

Tip for everyone participating in round 1: double-check your solutions with longer sequences. I suggest at least 100 elements, possibly 10k if your algorithm is very sophisticated. Try random data, already sorted data, reverse-sorted data, and random data with few unique values that are repeated often (for example 100 elements with only the values 1, 2, 3 or 4).

phillip1882 wrote: ↑Mon Nov 28, 2022 9:35 pm i'd like to host the next one. i have a couple contest ideas.

Noted! I look forward to being on the submitting side.

Jplus · Post by **Jplus** » Mon Dec 05, 2022 10:59 pm

I wrote the following in the challenge post of round 1 (emphasis added here):

Jplus wrote: ↑Sat Nov 26, 2022 1:41 am In each of the following categories, a is awarded to the solution with the best asymptotic complexity. (...)
(...)

Total number of copies/moves. (...) For fairer comparison with Haskell and similarly immutable programming languages, solutions that modify the input are pretended to work on a copy.

Amount of additional memory required. Again, for fairer comparison with immutable programming languages, solutions that modify the input are pretended to copy it first.

At second thought, I think this is unjust. In-place algorithms deserve a reward for efficiency. Also, while not straightforward, in-place sorting in Haskell is certainly possible.

Will anyone be sad if I remove the above pretense that in-place algorithms copy the input? For a typical quicksort in Python or C++, that would mean that the additional space is O(log N), as it is commonly understood, rather than O(N), as I would have pretended under the above rules.

phillip1882 · Post by **phillip1882** » Tue Dec 06, 2022 12:55 am

i dont mind either way

phillip1882 · Post by **phillip1882** » Thu Dec 15, 2022 1:16 am

so what all data types should we code for? i coded for standard sort with integers, strings, decreasing and ignore case. is there anything else?

Jplus · Post by **Jplus** » Thu Dec 15, 2022 1:51 am

In principle, arrays of pretty much anything could be sorted, for example also:

class instances (a.k.a. objects), possibly by a single (derived) property/attribute
arrays/vectors/lists (so I mean sorting an array of arrays)
hashes/maps/tables/dictionaries (possibly only by a subset of keys)
complex numbers, for example by absolute value
and so forth (your fantasy is the limit)

Notice what I wrote in the task description (emphasis added here):

Jplus wrote: ↑Sat Nov 26, 2022 1:41 am A solution that meets the minimal requirement is considered to pass, which earns it a . A solution can receive up to five additional for additional features:
(...)

Supporting not just descending order but arbitrary special sorting criteria, for example case-insensitive for strings (note that there are ways to support custom criteria generically).

phillip1882 · Post by **phillip1882** » Tue Jan 03, 2023 4:05 pm

I thought I should go ahead and post the challenge, no need to wait for the results of the previous one.
So, the challenge is: write a strictly deterministic game of rock, paper, scissors. no random functions allowed.
Parameters
your function will receive three inputs, the round number, you opponents previous moves, and your previous moves.
the moves will be in a format of 0 for rock, 1 for paper, and 2 for scissors.
your program can either return 0, 1, or 2; or "R", "P", or "S".
I ask you program in python, if you don't, write comments that will help me convert to python.
each participant can submit up to 5 programs.
each program will face every other program in round robin format.
there will be 5000 rounds per match
Victory conditions
your program will receive a reward for having the greatest number of match wins, as well as greatest number of round wins.
i'l also give a reward for clear concise code that plays decently.

here's some example programs

Code: Select all

def alwaysRock(i,mymoves,myopp):
   return 0

def counterPrev(i,mymoves, myopp):
   if i == 0:
      return 2
   else:
      return (myopp[-1]+1)%3 

def rotateRPS(i,mymoves,myopp):
   return i%3

deadline is march 1st.

somitomi · Post by **somitomi** » Tue Jan 03, 2023 9:09 pm

Oh dang, I kinda forgot about this one

phillip1882 · Post by **phillip1882** » Tue Jan 03, 2023 9:30 pm

have you not visited he forums in a while?

commodorejohn · Post by **commodorejohn** » Wed Jan 04, 2023 1:03 am

Hmm...!

Jplus · Post by **Jplus** » Fri Jan 06, 2023 4:15 pm

Opening post of round 1 here.

The deadline for round 1 has passed and the results are in! Below, all passing solutions by order of submission date.

Since all solutions were in Python, I decided to just do the performance measurements in Python, which is also the language in which I wrote the verification program. These measurements are based on total CPU time rather than user CPU time, and include time needed for preparing the input data. All solutions were fed the same data, so the comparisons are fair, though imprecision is causing ties to be broken arbitrarily. I sorted only a single a uniformly distributed array for the million element measurement and skipped the 10M measurement altogether, because I was impatient.

For those unfamiliar with complexity analysis and big O notation, there is a gentle introduction here. For the finer details, you can refer to Wikipedia. In the cache miss complexity analyses, I use the letter c to represent the capacity of the cache.

somitomi, if you would still like to submit a solution for round 1, I am happy to grade it with awards, though the rockets are already decided. Please don't read the spoilers yet in that case.

1. An improved insertion sort by phillip1882

Spoiler (Show/Hide)

In regular insertion sort, subsequent elements are inserted into the already sorted part of the range. The correct insertion point is found by linearly searching from the back of the range. This variant of insertion sort performs O(N²) comparisons and moves in the average and worst case, but only O(N) comparisons and moves for (nearly) sorted ranges.

There is a known optimization for insertion sort, where the insertion point is found using binary search. This fixes the number of comparisons (but not the number of moves) to O(N log N) in the best, average and worst case.

phillip1882 seemingly independently came up with a similar concept using a type of quintenary search. Instead of recursively splitting the sorted range in half until the correct position is found, it jumps through the range in decreasing powers of 5, going forward on the odd passes and backward on the even passes. It finishes with a short linear injection if the position found in this way is not exactly right.

At least, that was the idea. In this early version, phillip1882 forgot to flip the comparison when going backwards, which effectively disabled the optimization for lower powers of 5. Because of this, it is "only" five times faster than regular insertion sort in the average case. His third submission fixes this.

Awards and complexity

passes all tests

supports arbitrary data types thanks to Python's __lt__

original algorithm
comparisons: best N, average N(N-1)/20, worst N(N-1)/10
moves: best N, average N(N-1)/4, worst N(N-1)/2
additional memory: always N
cache misses: best N/c, average N(N-1)/4c, worst N(N-1)/2c

performs O(N) logarithm and power computations

O(N) comparisons and moves on sorted ranges

O(N) comparisons on reverse-sorted ranges

10 elements took 41.3 µs per array
100 elements took 649 µs per array
1000 elements took 16.8 ms per array
1e4 elements took 650 ms per array

bug-free version submitted already one day after the publication of the challenge. I found this impressive, especially given the original algorithm.

Note to the code: I made some small changes to make the program fit in with my testing harness. Those changes are highlighted below with end-of-line comments, with a minus indicating lines that I removed and a plus indicating lines that I added.

Code: Select all

import math
def sorting(data):
   if len(data) <=1:
      # print(data) # -
      return data   # +
   bucket = [data[0]]
   i = 1
   while i< len(data):
      j = 0
      kstep = 0
      step = int(math.log(len(bucket),5)-1)
      sign = 1
      if data[i] >bucket[-1]:
            bucket += [data[i]]
            i+= 1
            continue
      if data[i] < bucket[0]:
            bucket.insert(0,data[i])
            i += 1
            continue
      while kstep <= step:
         while data[i] > bucket[j]:
            j+= sign*5**(step -kstep)
            if j >= len(bucket):
               j = len(bucket)-1
               break
            if j < 0:
               j = 0
               break
            # print(j) # -
         kstep += 1
         sign = -1*sign

      bucket.insert(j,data[i])
      check = False
      j+= 1
      while j >= len(bucket):
         j -= 1
      while j > 0:
         if bucket[j] < bucket[j-1]:
            bucket[j], bucket[j-1] = bucket[j-1], bucket[j]
         elif check == False:
            if bucket[j] > bucket[j-1]:
               check = True
         else:
            break
         j -= 1
      i+=1
   # print(bucket) # -
   return bucket   # +

2. Smoothsort by phillip1882

(

)

Spoiler (Show/Hide)

I was extra impressed when phillip1882 submitted a second passing solution within a week, based on smoothsort. I had heard of this fancy algorith before, but so far had been unable to wrap my head around it. I had to write my own implementation of smoothsort first, before I could fully understand phillip1882's.

This version of smoothsort is not as efficient as the original version by Edsger Dijkstra.

Awards and complexity

passes all tests

supports arbitrary data types thanks to Python's __lt__
comparisons: always 2N(log_φ N)²
moves: best N, average and worst 2N(log_φ N)²
additional memory: always N + log_φ N
cache misses: always 2N(log_φ N)²/c

the compact Leonardo heap structure is recomputed twice for every element, which requires O(N (log N)²) additional comparisons and additions
10 elements took 105 µs per array
100 elements took 2.54 ms per array
1000 elements took 49.6 ms per array
1e4 elements took 821 ms per array
1e5 elements took 12.6 s per array

1e6 elements took 213 s per array

(would likely have won at 1e7 elements)

OMG smoothsort!

Code: Select all

def smoothsort(data,i,tree):
   structi =structure(data,i,tree)
   head = len(structi)-1
   if len(structi) == 1:
      i -= 1
      bubbleDown(data,i,tree,structi,0)
      return
   pos = i-1
   step = tree[structi[0]]-1
   sortHeads(data,pos,structi,tree)
   head = 0
   while True:
      if tree[structi[head]] == 1:
         break
      bubbleDown(data, step, tree, structi, head)
      head += 1
      if head >= len(structi):
         break
      step += tree[structi[head]]
   return

def sortHeads(data,pos,structi,tree):
   head = len(structi)-1
   step = tree[structi[head]]
   tpos = pos
   for i in range(1,len(structi)):
      while pos >0:

         if data[pos] < data[pos-step]:
            data[pos], data[pos-step] = data[pos-step],data[pos]
         pos -= step
         head -= 1
         if head == 0:
            break
         step = tree[structi[head]]
      pos = tpos
      head = len(structi)-1
      step = tree[structi[head]]
   return

def bubbleDown(data,position,tree,structi,head):
   instep = structi[head]
   j = 2
   while instep-j > -1 and position > 1:
      maxi = -1
      if instep-j == 1 or instep-j == 0:
         child1 = data[position-1]
         child2 = data[position-2]
      else:
         child1 = data[position-1]
         child2 = data[position-tree[instep-j]-1]
      if child1 >= child2:
         maxi = position-1
         j += 2
      else:
         maxi = position -tree[instep-j]-1
         j += 1
      if data[position] < data[maxi]:
         data[position], data[maxi] = data[maxi], data[position]
      position = maxi

   return

def structure(data,i,tree):
   structi = []
   while i > 0:
      size = 1
      while tree[size] < i:
         size += 1
      if tree[size] == i:
         i = 0
         structi += [size]
      else:
         size -= 1
         structi += [size]
         i -= tree[size]
   return structi


def sorting(data):
   if len(data)<=1:
      return data
   tree =[1,1]
   while tree[-1]<len(data):
      tree += [tree[-1] +tree[-2] +1]
   for i in range(1,len(data)+1):
      smoothsort(data,i,tree)
   for i in range(len(data)-1,0,-1):
      smoothsort(data,i,tree)
   return data

3. Truly quaternary insertion sort by phillip1882

Spoiler (Show/Hide)

This is the same algorithm as in phillip1882's first submission, but now working entirely as intended and based on powers of 4. Despite the still quadratic number of moves, this algorithm approaches the speed of proper divide-and-conquer algorithms such as quicksort and mergesort.

Awards and complexity

passes all tests

supports arbitrary data types thanks to Python's __lt__

original algorithm
comparisons: best N, average and worst N log₄ N
moves: best N, average N(N-1)/4, worst N(N-1)/2
additional memory: always N
cache misses: best N log₄(N/c), average N(N-1)/4c, worst N(N-1)/2c

performs O(N) logarithm and power computations

O(N) comparisons and moves on sorted ranges

O(N) comparisons on reverse-sorted ranges
10 elements took 41.8 µs per array
100 elements took 560 µs per array
1000 elements took 7.95 ms per array
1e4 elements took 118 ms per array
1e5 elements took 3.49 s per array
1e6 elements took 227 s per array

Code: Select all

import math
def sorting(data):
   if len(data) <=1:
      return data
   bucket = [data[0]]
   i = 1
   while i< len(data):
      j = 0
      kstep = 0
      step = int(math.log(len(bucket),4))
      sign = 1
      if data[i] >bucket[-1]:
         bucket += [data[i]]
         i+= 1
         continue
      if data[i] < bucket[0]:
         bucket.insert(0,data[i])
         i += 1
         continue
      while kstep <= step:
         if sign == 1:
            while data[i] > bucket[j]:
               j+= sign*4**(step -kstep)
               if j >= len(bucket):
                  j = len(bucket)-1
                  break
               if j < 0:
                  j = 0
                  break
            kstep += 1
            sign = -1*sign
         else:
            while data[i] < bucket[j]:
               j+= sign*4**(step -kstep)
               if j >= len(bucket):
                  j = len(bucket)-1
                  break
               if j < 0:
                  j = 0
                  break
            kstep += 1
            sign = -1*sign
      k = j
      bucket.insert(j,data[i])
      while k > 0 and bucket[k-1] > bucket[k]:
         bucket[k-1], bucket[k] = bucket[k] , bucket[k-1]
         k-=1
      while j<len(bucket) and bucket[j+1] < bucket[j]:
         bucket[j+1], bucket[j] = bucket[j] , bucket[j+1]
         j+= 1
      i+=1
   return bucket

4. Quaternary insertion sort with extra options by phillip1882

Spoiler (Show/Hide)

This is the same algorithm as the previous solution, but with an option to choose between "normal", case-insensitive, reverse, and reverse case-insensitive ordering. It achieves this by copying the code four times, with small changes on the lines that do the comparison.

Awards and complexity

passes all tests

supports arbitrary data types thanks to Python's __lt__

supports descending order as an option

original algorithm
comparisons: best N, average and worst N log₄ N
moves: best N, average N(N-1)/4, worst N(N-1)/2
additional memory: always N
cache misses: best N log₄(N/c), average N(N-1)/4c, worst N(N-1)/2c

performs O(N) logarithm and power computations

O(N) comparisons and moves on sorted ranges

O(N) comparisons on reverse-sorted ranges
10 elements took 43.0 µs per array

100 elements took 551 µs per array

1000 elements took 7.67 ms per array

1e4 elements took 116 ms per array

1e5 elements took 3.28 s per array
1e6 elements took 225 s per array

Code: Select all

import time
import math
import random
def sorting(data):
   if len(data) <= 1:
      return data
   bucket = [data[0]]
   i = 1
   while i< len(data):
      j = 0
      kstep = 0
      step = int(math.log(len(bucket),4))
      sign = 1
      if data[i] >bucket[-1]:
         bucket += [data[i]]
         i+= 1
         continue
      if data[i] < bucket[0]:
         bucket.insert(0,data[i])
         i += 1
         continue
      while kstep <= step:
         if sign == 1:
            while data[i] > bucket[j]:
               j+= sign*4**(step -kstep)
               if j >= len(bucket):
                  j = len(bucket)-1
                  break
               if j < 0:
                  j = 0
                  break
            kstep += 1
            sign = -1*sign
         else:
            while data[i] < bucket[j]:
               j+= sign*4**(step -kstep)
               if j >= len(bucket):
                  j = len(bucket)-1
                  break
               if j < 0:
                  j = 0
                  break
            kstep += 1
            sign = -1*sign
      k = j
      bucket.insert(j,data[i])
      while k > 0 and bucket[k-1] > bucket[k]:
         bucket[k-1], bucket[k] = bucket[k] , bucket[k-1]
         k-=1
      while j<len(bucket) and bucket[j+1] < bucket[j]:
         bucket[j+1], bucket[j] = bucket[j] , bucket[j+1]
         j+= 1
      i+=1
   return bucket

def sortingrev(data):
   if len(data) <= 1:
      return data
   bucket = [data[0]]
   i = 1
   while i< len(data):
      j = 0
      kstep = 0
      step = int(math.log(len(bucket),4))
      sign = 1
      if data[i] < bucket[-1]:
         bucket += [data[i]]
         i+= 1
         continue
      if data[i] > bucket[0]:
         bucket.insert(0,data[i])
         i += 1
         continue
      while kstep <= step:
         if sign == 1:
            while data[i] < bucket[j]:
               j+= sign*4**(step -kstep)
               if j >= len(bucket):
                  j = len(bucket)-1
                  break
               if j < 0:
                  j = 0
                  break
            kstep += 1
            sign = -1*sign
         else:
            while data[i] > bucket[j]:
               j+= sign*4**(step -kstep)
               if j >= len(bucket):
                  j = len(bucket)-1
                  break
               if j < 0:
                  j = 0
                  break
            kstep += 1
            sign = -1*sign
      k = j
      bucket.insert(j,data[i])
      while k > 0 and bucket[k-1] < bucket[k]:
         bucket[k-1], bucket[k] = bucket[k] , bucket[k-1]
         k-=1
      while j<len(bucket) and bucket[j+1] > bucket[j]:
         bucket[j+1], bucket[j] = bucket[j] , bucket[j+1]
         j+= 1
      i+=1
   return bucket

def sortingigcase(data):
   if len(data) <= 1:
      return data
   bucket = [data[0]]
   i = 1
   while i< len(data):
      j = 0
      kstep = 0
      step = int(math.log(len(bucket),4))
      sign = 1
      tempdata = data[i].upper()
      tempbucket = bucket[-1].upper()
      if tempdata >tempbucket:
         bucket += [data[i]]
         i+= 1
         continue
      tempbucket = bucket[0].upper()
      if tempdata < tempbucket:
         bucket.insert(0,data[i])
         i += 1
         continue
      while kstep <= step:
         if sign == 1:
            tempbucket = bucket[j].upper()
            while tempdata > tempbucket:
               j+= sign*4**(step -kstep)
               if j >= len(bucket):
                  j = len(bucket)-1
                  tempbucket =bucket[j].upper()
                  break
               if j < 0:
                  tempbucket =bucket[0].upper()
                  j = 0
                  break
               tempbucket =bucket[j].upper()
            kstep += 1
            sign = -1*sign
         else:
            while tempdata < tempbucket:

               j+= sign*4**(step -kstep)
               if j >= len(bucket):
                  j = len(bucket)-1
                  tempbucket = bucket[j].upper()
                  break
               if j < 0:
                  j = 0
                  tempbucket = bucket[j].upper()
                  break
               tempbucket = bucket[j]
            kstep += 1
            sign = -1*sign
      k = j
      bucket.insert(j,data[i])

      tempbucketi1 = bucket[k].upper()
      tempbucketi2 = bucket[k-1].upper()
      while k > 0 and  tempbucketi1 < tempbucketi2:
         bucket[k-1], bucket[k] = bucket[k] , bucket[k-1]
         k-=1
         tempbucketi1 = bucket[k].upper()
         tempbucketi2 = bucket[k-1].upper()
      k = j
      tempbucketi1 = bucket[k].upper()
      tempbucketi2 = bucket[k+1].upper()
      while k < len(bucket) and  tempbucketi1 > tempbucketi2:
         bucket[k+1], bucket[k] = bucket[k] , bucket[k+1]
         k+= 1
         tempbucketi1 = bucket[k].upper()
         tempbucketi2 = bucket[k+1].upper()
      i+=1

   return bucket

def sortingrevigcase(data):
   if len(data) <= 1:
      return data
   bucket = [data[0]]
   i = 1
   while i< len(data):
      j = 0
      kstep = 0
      step = int(math.log(len(bucket),4))
      sign = 1
      tempdata = data[i].upper()
      tempbucket = bucket[-1].upper()
      if tempdata <tempbucket:
         bucket += [data[i]]
         i+= 1
         continue
      tempbucket = bucket[0].upper()
      if tempdata > tempbucket:
         bucket.insert(0,data[i])
         i += 1
         continue
      while kstep <= step:
         if sign == 1:
            tempbucket = bucket[j].upper()
            while tempdata < tempbucket:
               j+= sign*4**(step -kstep)
               if j >= len(bucket):
                  j = len(bucket)-1
                  tempbucket =bucket[j].upper()
                  break
               if j < 0:
                  tempbucket =bucket[0].upper()
                  j = 0
                  break
               tempbucket =bucket[j].upper()
            kstep += 1
            sign = -1*sign
         else:
            while tempdata > tempbucket:

               j+= sign*4**(step -kstep)
               if j >= len(bucket):
                  j = len(bucket)-1
                  tempbucket = bucket[j].upper()
                  break
               if j < 0:
                  j = 0
                  tempbucket = bucket[j].upper()
                  break
               tempbucket = bucket[j]
            kstep += 1
            sign = -1*sign
      k = j
      bucket.insert(j,data[i])

      tempbucketi1 = bucket[k].upper()
      tempbucketi2 = bucket[k-1].upper()
      while k > 0 and  tempbucketi1 > tempbucketi2:
         bucket[k-1], bucket[k] = bucket[k] , bucket[k-1]
         k-=1
         tempbucketi1 = bucket[k].upper()
         tempbucketi2 = bucket[k-1].upper()
      k = j
      tempbucketi1 = bucket[k].upper()
      tempbucketi2 = bucket[k+1].upper()
      while k < len(bucket) and  tempbucketi1 < tempbucketi2:
         bucket[k+1], bucket[k] = bucket[k], bucket[k+1]
         k+= 1
         tempbucketi1 = bucket[k].upper()
         tempbucketi2 = bucket[k+1].upper()
      i+=1

   return bucket

def sort(data, choice):
   """options: 1 standard lower to higher sort, 2 reverse sort, 3 ignore the case of strings and standard sort, 4 ignore the case of strings and  reverse sort """
   if choice == 1:
      return sorting(data)
   if choice == 2:
      return sortingrev(data)
   if choice == 3:
      return sortingigcase(data)
   if choice == 4:
      return sortingrevigcase(data)

Verification program, also used for time measurements

Spoiler (Show/Hide)

Code: Select all

import copy
import time
import random
from itertools import pairwise, starmap

import s01_phillip1882 as s01
import s02_phillip1882 as s02
import s03_phillip1882 as s03
import s04_phillip1882 as s04


def wrap_inplace(f):
    def sort(data):
        duplicate = copy.copy(data)
        f(duplicate)
        return duplicate
    return sort


def wrap_s04_choice(data):
   return s04.sort(data, 1)


CANDIDATES = {
    's01': s01.sorting,
    's02': wrap_inplace(s02.sorting),
    's03': s03.sorting,
    's04': wrap_s04_choice,
    'py': sorted,
}

EXAMPLES = [
    [],
    [1],
    [1, 1, 1],
    [1, 2, 3],
    [3, 2, 1],
    [8, 4, 7, 3, 9, 2, 7, 5],
    [2, 1, 1, 1, 2, 1, 2, 2, 1, 2],
]

POPULATION = list(range(4))
SIZES = [100, 1000, 10000]


def all_equal(size):
    return [1] * size


def ascending(size):
    return list(range(0, size))


def descending(size):
    return list(range(size, 0, -1))


def permutation(size):
    keys = list(range(size))
    random.shuffle(keys)
    return keys


def uniform(size):
    maximum = size - 1
    return [random.randint(0, maximum) for i in range(size)]


def uniform_few(size):
    return random.choices(POPULATION, k=size)


def expose_25(size):
    threshold = 25
    while threshold < size:
        threshold *= 25
    threshold //= 25
    rest = max(1, size - threshold)
    return [1] * threshold + [2] * rest


def median3_killer(size):
    half = (size - 1) // 2
    return [
        half + i - 1 if i % 2 else i for i in range(0, half)
    ] + [
        (i - half) * 2 + 1 for i in range(half, size)
    ]


def beta(alpha, beta):
    def distribution(size):
        return [
            round(random.betavariate(alpha, beta) * size)
            for i in range(size)
        ]
    distribution.__name__ = 'beta({}, {})'.format(alpha, beta)
    return distribution

skew_low = beta(.9, 3)
skew_high = beta(3, .9)
skew_both = beta(.5, .5)
skew_mid = beta(4, 4)


GENERATORS = [
    all_equal, ascending, descending, permutation, uniform, uniform_few,
    expose_25, median3_killer,
    skew_low, skew_high, skew_both, skew_mid,
]


class AnnotatedValue(object):
    def __init__(self, value, index):
        self.value = value
        self.index = index

    def __lt__(self, other):
        return self.value < other.value

    def __le__(self, other):
        return self.value <= other.value

    def __eq__(self, other):
        return self.value == other.value

    def __ne__(self, other):
        return self.value != other.value

    def __gt__(self, other):
        return self.value > other.value

    def __ge__(self, other):
        return self.value >= other.value

    def before(self, other):
        if self.value < other.value:
            return True
        if other.value < self.value:
            return False
        return self.index < other.index

    def __str__(self):
        return str((self.index, self.value))


def annotated(sequence):
    return [AnnotatedValue(v, i) for i, v in enumerate(sequence)]


def before(left, right):
    return left.before(right)


sequence_fmt = '{} failed on {}'.format
generator_fmt = '{} failed on {} {}'.format
validation_msg = 'Need to provide either a sequence or both a generator and a size'
stable_fmt = '{} passed in {} ms and is stable'.format
passed_fmt = '{} passed in {} ms'.format


def verify(f, name, sequence=None, generator=None, size=None):
    if sequence is not None:
        message = sequence_fmt(name, sequence)
    elif generator and size:
        sequence = generator(size)
        message = generator_fmt(name, generator.__name__, size)
    else:
        raise ArgumentError(validation_msg)

    try:
        assert f(sequence) == sorted(sequence)
        return True
    except:
        print(message)
        return False


def check_solution(name, f):
    start = time.perf_counter_ns()
    for sequence in EXAMPLES:
        if not verify(f, name, sequence=sequence):
            return
    for size in SIZES:
        for generator in GENERATORS:
            if not verify(f, name, generator=generator, size=size):
                return
    sequence = annotated(uniform_few(1000))
    result = f(sequence)
    end = time.perf_counter_ns()
    elapsed_ms = round((end - start) / 1e6)
    if all(starmap(before, pairwise(result))):
        print(stable_fmt(name, elapsed_ms))
    else:
        print(passed_fmt(name, elapsed_ms))


def main():
    state = random.getstate()
    for name, f in CANDIDATES.items():
        check_solution(name, f)
        random.setstate(state)


if __name__ == '__main__':
    main()

Update 2023-03-02: submissions that came in between the original deadline and the extended deadline of March 1st.

,

and

awards are "virtual"; they would have been awarded if these solutions had been submitted before the original deadline. The original four solutions would have gotten fewer of those rewards as a result. I have performed new measurements on both the old and the new solutions, but I have not updated the numbers in the old solutions in order to preserve consistency in their original

awards. The new timings are based on uniform distributions only.

5. Heapsort by phillip1882

Spoiler (Show/Hide)

This is a fairly regular heapsort. There is a very slight mistake in the algorithm, which causes the first two elements to be swapped sometimes. phillip1882 addressed this by making a final insertion sort pass, which adds O(N) comparisons.

Awards and complexity

passes all tests

supports arbitrary data types thanks to Python's __lt__
comparisons: 2N log₂ N + N
moves: 6N log₂ N
additional memory: N
cache misses: 2(N log₂ N)/c
10 elements took 36.1 µs per array
100 elements took 519 µs per array
1000 elements took 8.52 ms per array
1e4 elements took 119 ms per array
1e5 elements took 1.41 s per array
1e6 elements took 19.3 s per array

Code: Select all

def sorting(data):
   if len(data)<=1:
      return data
   for i in range(1,len(data)):
      j = i
      while j > 0 and data[j] > data[(j-1)//2]:
         data[j],data[(j-1)//2] = data[(j-1)//2], data[j]
         j = (j-1)//2

   for i in range(len(data)-1,0,-1):
      j = 0
      data[i],data[j] = data[j],data[i]
      while 2*j+2 < i:
         child1 = 2*j+1
         child2 = 2*j+2
         if data[child1] > data[child2]:
            if data[j] < data[child1]:
               data[child1],data[j] =data[j],data[child1]
               j = 2*j+1
            else:
               break
         else:
            if data[j] < data[child2]:
               data[child2],data[j] =data[j],data[child2]
               j = 2*j+2
            else:
               break
   for i in range(0,len(data)):
      j = i
      while j>0 and data[j] < data[j-1]:
         data[j],data[j-1]=data[j-1],data[j]
         j-=1
   return data

6. Shellsort by phillip1882

Spoiler (Show/Hide)

Shellsort is like insertion sort, but making large jumps initially. The jumps are reduced until a regular insertion sort with "jumps" of step 1 remains. This is similar to the relation between Combsort and Bubblesort. phillip1882 used an original step size sequence based on division by Euler's number e.

At first sight, one might expect any Shellsort to be O(N log N). Unfortunately, it is not so simple because elements might still need to travel distances greater than the previous step size during each pass. My training in algorithm analysis is not strong enough to determine the complexity exactly, so I set lower and upper bounds instead.

Awards and complexity

passes all tests

supports arbitrary data types thanks to Python's __lt__

original gap size sequence
comparisons: at least N ln N, at most N(N-1)/2
moves: at most 3N(N-1)/2
additional memory: N
cache misses: N(N-1)/2c² (guess)

10 elements took 29.0 µs per array

100 elements took 385 µs per array
1000 elements took 7.38 ms per array
1e4 elements took 118 ms per array
1e5 elements took 1.44 s per array
1e6 elements took 17.9 s per array

Code: Select all

def sorting(data):
   if len(data) <= 1:
      return data
   if len(data) == 2:
      if data[0] > data[1]:
          data[0], data[1] = data[1], data[0]
      return data
   e = 2.718
   step = int(len(data)/e)
   check = False
   while step > 0:
      for i in range(step, len(data)):
         j = i
         while j-step > -1 and data[j] < data[j-step]:
            data[j], data[j-step] = data[j-step], data[j]
            j -= step
      step = int(step/e)
      if not check and step == 1:
         check = True
         continue
      elif check:
         break
      elif step == 0:
         step = 1
         check = True
   return data

7. Bottom-up mergesort by phillip1882

Spoiler (Show/Hide)

Bottom-up mergesort first cuts the input into pairs and sorts each pair. Then, adjacent pairs are merged, creating sorted subsequences of size 4. This continues upwards until the entire input is sorted. phillip1882 exclusively used single-pass lists that never get modified; this gave the algorithm great cache locality at the expense of using more memory.

Awards and complexity

passes all tests

supports arbitrary data types thanks to Python's __lt__
comparisons: N log₂ (N+1)
moves: N log₂ (N+1)
additional memory: N log₂ N
cache misses: (N log₂ (N+1))/c
10 elements took 43.7 µs per array
100 elements took 494 µs per array
1000 elements took 6.41 ms per array
1e4 elements took 82.5 ms per array
1e5 elements took 1.07 s per array
1e6 elements took 13.2 s per array

Code: Select all

def merge(left,right):
   bucket = []
   i = 0
   j = 0
   while True:
      if i == len(left):
         bucket += right[j:]
         return bucket
      if j == len(right):
         bucket += left[i:]
         return bucket
      if left[i] < right[j]:
         bucket += [left[i]]
         i+=1
      else:
         bucket += [right[j]]
         j+= 1

def sorting(data):
   if len(data) <=1:
      return data
   array = []
   for i in range(0,len(data),2):
      if len(data)-i == 1:
         array += [data[i]]
      else:
         array += merge([data[i]],[data[i+1]])
   data =array[:]
   array = []
   i = 0
   step = 2
   while step < len(data):
      while i < len(data):
         array += merge(data[i:i+step],data[i+step:i+2*step])
         i += step*2
      i = 0
      step *= 2
      data =array[:]
      array = []
   return data

8. Randomized dual pivot auxiliary memory quicksort by phillip1882

Spoiler (Show/Hide)

"Plain" quicksort picks a pivot element, moves all elements that are less than the pivot to the beginning of the sequence (in-place) and then recursively sorts both subranges. This variant picks two pivots and creates three subranges. The pivots are chosen at random. Instead of partitioning the ranges in place, the subranges are copied to auxiliary memory.

Awards and complexity

passes all tests

supports arbitrary data types thanks to Python's __lt__
comparisons: average N log_{1.8 (?)} N, worst N²
moves: average N log_{1.8 (?)} N, worst N²
additional memory: average (N+1) log_{1.8 (?)} N, worst N²
cache misses: average 2(N log_{1.8 (?)} N)/3c, worst 2(N²)/c

performs O(N) pseudorandom number generations
10 elements took 50.9 µs per array
100 elements took 553 µs per array

1000 elements took 6.40 ms per array

1e4 elements took 73.8 ms per array

1e5 elements took 890 ms per array

1e6 elements took 10.6 s per array

Code: Select all

import random
def sorting(data):
   size = len(data)
   if size <= 1:
      return data
   if size == 2:
      if data[0] > data[1]:
         data[0], data[1] =data[1], data[0]
      return data
   pivot1 = random.randint(0,size-2)
   pivot2 = random.randint(pivot1+1, size-1)
   if data[pivot1] > data[pivot2]:
      pivot1, pivot2 = pivot2, pivot1
   left = []
   middle = []
   right = []
   if data[pivot1] == data[pivot2]:
      check = False
      for i in range(0,size):
         if i == pivot1:
            continue
         if data[i] <data[pivot1]:
            left += [data[i]]
         else:
            right += [data[i]]
            if i+1 < size and data[i] != data[i+1]:
               check = True
      if left == []:
         if check:
            return [data[pivot1]] +sorting(right)
         else:
            return [data[pivot1]]+right[:]
      else:
         return (sorting(left)+[data[pivot1]]+sorting(right))
   else:
      for i in range(0,size):
         if data[i] <data[pivot1]:
            left += [data[i]]
         elif data[i] < data[pivot2]:
            middle += [data[i]]
         else:
            right += [data[i]]
      if left == []:
         return sorting(middle) +sorting(right)
      else:
         return sorting(left) +sorting(middle) +sorting(right)

9. "Smooth heap sort" by phillip1882

Spoiler (Show/Hide)

This is another original algorithm by phillip1882. At first sight, it appears to be a modification of his heapsort, since it also ends with a "repair insertion sort". However, the main algorithm is quite different. Unfortunately, I am currently not awake enough to fully understand it. I invite phillip1882 to explain it. Based on the loop structure, I'm guessing the complexity is somewhere in the vicinity of O(N log N log log N).

Awards and complexity

passes all tests

supports arbitrary data types thanks to Python's __lt__

original algorithm
comparisons: O(N log N log log N)?
moves: O(N log N log log N)?
additional memory: N
cache misses: O(N log N log log N)/c?
10 elements took 46.4 µs per array
100 elements took 1.10 ms per array
1000 elements took 32.6 ms per array
1e4 elements took 766 ms per array

Code: Select all

import math
def sorting(data):
   if len(data) <= 1:
      return data
   cycle = int(math.log(len(data),2))
   if cycle < 1:
      cycle = 1
   for v in range(0,cycle):
      for i in range(1,len(data)):
         j = 0
         k = i
         while k != j:
            child1 = k-1
            child2 = (j+k-1)//2
            if data[child1] < data[child2]:
               if data[k] < data[child2]:
                  data[k], data[child2] = data[child2], data[k]
                  k = child2
               else:
                  break
            else:
               if data[k] < data[child1]:
                  if data[k] <data[child2]:
                     data[k],data[child2] = data[child2], data[k]
                     if child1 != 0:
                        data[child1],data[k] = data[k],data[child1]
                     k -= 1
                     j =child2
                  else:
                     data[k], data[child1] = data[child1], data[k]
                     k -= 1
                     j = child2
               else:
                  break
   for i in range(0,len(data)):
      j = i
      while j >0 and data[j]< data[j-1]:
         data[j], data[j-1]= data[j-1], data[j]
         j -=1
   return data

phillip1882 · Post by **phillip1882** » Fri Jan 06, 2023 6:25 pm

i still was the only one that submitted code? come on guys. you had plenty of time to try writing a sort, and they can be easy to write.
smooth heap sort

Spoiler (Show/Hide)

so my program tries to combine elements of smooth sort and heap sort into one algorithm. it compares the adjacent index, and a distant index to see which is larger, then swaps the current index if its smaller. then the index is changes to the index it swaps with.it does this log₂(len(array) times, giving it a complexity of N(log_n)^2
e5 2.11 sec
e6 35.16 sec
oh, i wanted to point out it uses hardly any additional memory. only about 3 values are stored at any given time.
also my shell sort and heap sort dont use much.

Jplus · Post by **Jplus** » Fri Jan 06, 2023 9:18 pm

Yes, you were the only one who didn't forget.

2023-03-07 Edit to add: response to updated comments in previous post.

You are right that your heapsort, shellsort and smooth heapsort only take constant additional memory intrinsically. However, in the original challenge rules, I stated that I would pretend that in-place algorithms copy the input sequence to a new container first. I did that in order to make life easier for participants using programming languages that favor immutability, such as Haskell. In the end, nobody did that, but I could not know that beforehand. It does not matter much for the grades, anyway; the constant additional memory solutions still get the stars they deserve, since you cannot go less than N.

As for the time recordings you made for smooth heapsort with > 1e4 elements: unfortunately, those cannot be compared with time measurements made on my computer. It is clear that it is among the slower algorithms, though, so it does not earn rockets in any case.

phillip1882 · Post by **phillip1882** » Sat Jan 07, 2023 1:43 am

yeah for my third program, i found base 4 faster than base 5, and base 3 about the same time as 4. i didn't check base 2, but assume its slower.

commodorejohn · Post by **commodorejohn** » Sat Jan 07, 2023 1:44 am

I meant to try that one, but I got caught up in other things :/ Next one for sure though ;)

Jplus · Post by **Jplus** » Sat Jan 07, 2023 9:47 am

commodorejohn wrote: ↑Sat Jan 07, 2023 1:44 am I meant to try that one, but I got caught up in other things :/ Next one for sure though

If you want, you can still submit a solution for round 1 and get it graded.

phillip1882 · Post by **phillip1882** » Sun Jan 15, 2023 10:42 pm

i'm having some trouble with the emoji in round 1 results, i see a broken image pic next to each emoji

Jplus · Post by **Jplus** » Mon Jan 16, 2023 11:57 pm

I see it, too. That was not the case when I posted it. Maybe a problem with the custom font used by phpBB?

Freddino18 · Post by **Freddino18** » Sat Jan 21, 2023 7:18 am

Probably. Get the squirrel on it. I see it only on colored squares though. Maybe it's an error on the user side?

phillip1882 · Post by **phillip1882** » Sat Jan 21, 2023 5:57 pm

I was thinking of writing more sorting algorithms just for the giggles. I'm very bored.

New XKCD Forums

Programming challenges

Programming challenges

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Round 1: sorting algorithms

Re: Programming challenges

Re: Round 1: sorting algorithms

Re: Round 1: sorting algorithms

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Round 1 results

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges

Re: Programming challenges