A while – as in, several years – ago, I came across a cool blog post by Roger Alsing (this one) where he describes how he re-painted the Mona Lisa using a limited number of polygons and an evolutionary algorithm. This post was picked up by quite a few people and inspired many other posts where people showed their own take on the problem (see here for a forum thread with links to several other realisations of such algorithms). I wondered if such an algorithm could me made to converge more quickly, and if including shapes with colour gradients would mean that you’d need fewer polygons to get a result of the same quality. So, I whipped up some code that allowed me to play around with this. I have used C++ and OpenGL for this project, as the fast drawing afforded by OpenGL greatly speeds up the whole thing (a while ago, I originally made something in Python which was incredibly slow – so I quickly switched over).
In this project, I have mostly used a bunch of images I found in Google image search, so if any of the images posted here is yours and you would prefer me to omit it, drop me a line.
The objective here is to replicate an image, as accurately as possible, using a limited number of triangles (say, 200 of them) to draw it. These triangles can have varying transparencies and colours, and can be placed in different ‘orders’ (meaning which triangles show up in front of which other ones). In my case, I have also allowed the triangles to have per-vertex colours and alpha values, enabling smooth colour gradients to be replicated by a single triangle. The quality of an image drawn using a bunch of triangles is measured by comparing it to the original input image on a pixel-by-pixel basis, and summing the squared differences of pixel brightness in all R,G,B colour channels over the entire image. The lower this total score, the better the triangle image approximates the original one.
The basic concept of the evolutionary algorithm used here is that we start off with a random configuration of triangles (which, taken together, likely looks nothing like the original image), and that we let this configuration ‘jiggle’ a bit by varying a number of the parameters for its triangles (vertex locations, colours, transparencies, ordering). We then check if what we get is better than what we had before – if so, we use the new version and jiggle that around. If not, we try jiggling it again. To improve the efficiency of this (blind) process, we use a ‘population’ of images: we don’t just have one single image made by a bunch of triangles, we have a population of slightly different ‘jigglings’ of the same set of triangles. The best jiggling wins, and is selected to provide a new generation of derived jiggled images from which we then choose the best. We continue this process until we are satisfied with the result. In this sense, you could see it as a genetic algorithm which only has one parent per generation and small mutations to its children.
The disadvantage of blind mutation
Letting the algorithm develop the image blindly means that is converges very slowly: it gets progressively more unlikely that a random variation on the best current result produces anything that is better. This is because there are many things that change simultaneously with each jiggling: all vertices of all triangles can undergo changes, so a favourable mutation is likely to be accompanied by unfavourable ones. Also, generally if a triangle is placed in an area of the image where it does not contribute positively to image quality it has no way of suddenly appearing in a different area where it might be beneficial (the jiggling amplitude is too small for that), and it is likely to become ever more transparent until it effectively plays no role anymore. So it is likely that there are ways in which we can improve upon this blind process: we can restrict the number of changes we allow for each generation (limiting ourselves to changing only one triangle for instance), we can change the population size (number of derived images from the parent), and we can place limits on how strongly the triangles can mutate. We can even allow these limitations to vary over the course of the whole approximation process.
The animation shown here illustrates one way in which we initially restrict the allowed mutations: each sub-cell of the image has 8 triangles in it, which initially are only allowed to vary within that cell – but we vary 10 random triangles at the same time. As this first phase of the algorithm converges towards a stationary quality, we limit the number of triangles which we simultaneously mutate to 2. Further in, we relax the cell restriction and allow triangles to vary in size over the entire image. Even so convergence progresses rapidly at first and then slows down to a crawl. Evidently there is much we can still improve in this algorithm to speed it up a bit. By the way, note that the animation above only uses uniformly coloured triangles: no per-vertex colouring or alpha blending is used here yet. These improvements will be the subject of a future blog post – so watch this space.