Why you should know Bresenham's line-drawing algorithm

2025/05/30

Categories: Geekstuff

I have a soft spot for this algorithm because it’s the first one I actually learned. My father and a friend released a graphics library for the Zenith Z-100 series of computers using Turbo Pascal, and as a result, when I mentioned wanting to make a line that went smoothly towards a point, he could hand me a page and a half of Pascal code to study and contemplate. I eventually converted it to about 10 lines of C, which was way too pointer-heavy to be a good example of anything, but the algorithm stuck with me.

So I want to talk a bit about why this is such a pretty algorithm, and also why it’s still useful to you even if you never find yourself wanting to write single pixels at a time in a frame buffer.

The basic idea is this: You want to draw a line between two points on a grid. Here are a few ways that can go:

Fastest Approach
............
....******B.
...*........
..*.........
.A..........
............

This one’s easy: start at the initial point and, for each step, increment any coordinate that hasn’t reached its target value, then draw a point.
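
Here’s a minimal sketch of that in C, assuming a plot(x, y) routine you supply, and assuming B is up and to the right of A (more on directions later):

/* "Fastest" approach: step diagonally until one axis is done, then
   go straight. Assumes x1 >= x0 and y1 >= y0. */
void line_fastest(int x0, int y0, int x1, int y1) {
    int x = x0, y = y0;
    plot(x, y);
    while (x != x1 || y != y1) {
        if (x != x1) x++;  /* advance each coordinate still short of its target */
        if (y != y1) y++;
        plot(x, y);
    }
}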

Slowest Approach
............
..........B.
.........*..
........*...
.A******....
............

Slightly harder: at each step, reduce whichever remaining delta is larger, or both if they’re equal.
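
Again as a rough sketch in C, under the same assumptions as before:

/* "Slowest" approach: step along whichever axis has more distance
   left, or both when they're tied. */
void line_slowest(int x0, int y0, int x1, int y1) {
    int x = x0, y = y0;
    plot(x, y);
    while (x != x1 || y != y1) {
        int rx = x1 - x, ry = y1 - y;  /* remaining deltas, both non-negative */
        if (rx >= ry) x++;
        if (ry >= rx) y++;
        plot(x, y);
    }
}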

Smooth and Pretty
............
........**B.
......**....
...***......
.A*.........
............

This looks better, but how do you do it? Well, that’s where the fancy and clever line-drawing algorithm comes in.

Understanding the algorithm

Let’s start with a bit of background on what we’re trying to do, and how we’re exploring it.

Without loss of generality

In this article, I only talk about cases where the X coordinate has more delta to cover, and where both deltas are positive. There are actually eight cases (sign of $dx$ and $dy$, and their relative magnitudes), but they’re equivalent. Similarly, we can simply ignore the starting coordinates, and treat everything as starting at $(0, 0)$, because you can figure out how to add constant offsets to numbers. So while in theory we’re drawing a line from $(x, y)$ to $(x', y')$, actually we care only about a set of points starting at $(0, 0)$ and ending at $(dx, dy)$, which contains $dx + 1$ total points. We can refer to each point as $(x_n, y_n)$, where $(x_0, y_0)$ is actually just $(0, 0)$, and $(x_{dx}, y_{dx})$ is actually just $(dx, dy)$. (In fact, for all $i$ in $[0, dx]$, $x_i$ is just $i$.)

What makes a good line?

So, let’s talk about what we want. For a line-drawing algorithm, there are several desirable traits. They are:

- The line starts and ends exactly on the given endpoints.
- Every point is as close as possible to the ideal line.
- The points are spread evenly, with no clumps or stray gaps.
- Drawing from B to A gives the same points as drawing from A to B.

And as is so often the case, it’s really easy to see that you can’t have all of these. Consider a very short line:

Shortest Interesting Case
..... ..... .....
...B. ..#B. ...B.
.A... .A... .A#..
..... ..... .....

There are two possible points to pick between A and B, and no matter how you pick, you’ll lose at least one of these properties. You can’t get around that.

The basic concept

For each point, we know what $x$ should be, and we want to find the best value of $y$. The theoretically correct value of $y$ is going to be $x \cdot dy / dx$. For instance, if $dx$ is 7, and $dy$ is 5, the correct value for $y_2$ is $(2 \cdot 5) / 7$. This admits a naive implementation: we maintain an accumulated error, our deviation from the idealized target, scaled by $dx$ so everything stays in integers. With each step, we add $dy$ to it. When it reaches or exceeds $dx$, we increment $y$ and drop our error counter by $dx$.
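
As a sketch in C, again assuming a plot(x, y) routine, and sticking to the simplified case where we start at the origin and $0 \le dy \le dx$:

/* Naive version: accumulate dy per step; a full dx worth of
   accumulated error means we owe a step in y. */
void line_naive(int dx, int dy) {
    int y = 0, error = 0;
    plot(0, 0);
    for (int x = 1; x <= dx; x++) {
        error += dy;
        if (error >= dx) {  /* crossed a pixel boundary */
            error -= dx;
            y++;
        }
        plot(x, y);
    }
}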

So, for the $dy = 5$, $dx = 7$ case:

..........
........B.
..........
..........
..........
..........
.A........
..........

We have $x_0 = 0$, $y_0 = 0$, $error = 0$. Applying the algorithm once yields $x_1 = 1$, $y_1 = 0$, $error = 5$:

..........
........B.
..........
..........
..........
..........
.A*.......
..........

Applying it again gets us $x_2 = 2$, $y_2 = 1$, $error = 3$:

..........
........B.
..........
..........
..........
...*......
.A*.......
..........

Applying it again gets us $x_3 = 3$, $y_3 = 2$, $error = 1$:

..........
........B.
..........
..........
....*.....
...*......
.A*.......
..........

Applying it again gets us $x_4 = 4$, $y_4 = 2$, $error = 6$ (no increment to $y$ this time):

..........
........B.
..........
..........
....**....
...*......
.A*.......
..........

The last three steps get us $error$ of 11 (reduces to 4 after incrementing $y$), 9 (reduces to 2 after incrementing $y$), and 7 (reduces to 0 after incrementing $y$).

..........
........B.
.......*..
......*...
....**....
...*......
.A*.......
..........

This is nice, but it can be made a bit smarter. In particular, this always maximally delays the first vertical step. In some cases it might be more accurate to take that step earlier.

So for the actual algorithm, we start our error value off at $2dy - dx$, and then use $2dy$ and $2dx$ as revised $dy$ and $dx$ values for the increment and decrement steps. Conceptually, this avoids having the algorithm always start at zero error. If the starting value is positive, the algorithm will take some steps slightly sooner; if it’s negative, the algorithm will take some steps slightly later. For instance, for our $dy = 5$, $dx = 7$ line, the fancy version starts with $error = 3$, incrementing by 10 per step and testing against (and decreasing by) 14. The steps are then as follows, with a code sketch after the table:

            x y error
..........  0 0 3
........B.  1 0 13
.......*..  2 1 9
.....**...  3 2 5
....*.....  4 3 1
...*......  5 3 11
.A*.......  6 4 7
..........  7 5 3
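
In code, the change from the naive sketch above is tiny; this version reproduces the table exactly:

/* Fancy version: same loop, but the error starts at 2*dy - dx, and
   the increment and the wrap threshold are both doubled. */
void line_fancy(int dx, int dy) {
    int y = 0, error = 2 * dy - dx;
    plot(0, 0);
    for (int x = 1; x <= dx; x++) {
        error += 2 * dy;
        if (error >= 2 * dx) {
            error -= 2 * dx;
            y++;
        }
        plot(x, y);
    }
}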

To understand why the starting error should be $2dy - dx$, remember that we’re in one octant of the possible space: the octant running from a horizontal line going right, up to a 45-degree line going up and to the right. At the bottom of this octant, $dy$ is 0. At the top of it, $dy = dx$. The middle of the octant is the line where $2dy = dx$, and for that line, we start our accumulated error at 0. As we move down from that, $2dy$ becomes less than $dx$, and our starting error becomes negative, pushing the steps at which we increment $y$ later, by up to half a step as $dy$ approaches 0. As we move up from it, $2dy$ becomes greater than $dx$, moving the steps at which we increment earlier, ultimately by up to half a step (at $dy = dx$, where $2dy - dx$ is just $dx$). The switch from using $dx$ and $dy$ to $2dx$ and $2dy$ for the main body of the algorithm just gives us better effective resolution on this.

Okay, but if you don’t want to draw pixelated lines, why do you care?

This is equivalent to a lot of other things

The actual purpose of this algorithm is to spread events out evenly. We have $S$ total steps, and $N$ events, where $N \le S$, and we want the events to happen evenly spread throughout the steps. In the original case, the event was “a change in $y$ coordinate while drawing lines”, but that doesn’t matter. We spend a lot of time in programming wanting to spread events out, and most ways to do this are more expensive and less reliable than this algorithm.

Imagine that you want to divide work items evenly into queues, but they may not divide evenly. For instance, maybe you have 8 items and 5 queues. You can do this as $\{2, 2, 2, 1, 1\}$, and it’ll be fine, but it’s got a bias towards earlier positions that you might prefer to avoid. Wouldn’t it be nicer to have something that looks like $\{2, 1, 2, 1, 2\}$?

So, what you really need to do is assign the extras to 3 of the 5 queues. What we need is an algorithm that can let us decide in which 3 of 5 positions to do something, spreading them out “evenly”. Does this sound familiar? It should, because this is exactly the same as asking on which three of five steps our $y$ coordinate increments.
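
A sketch of that in C (the function name and the mid-phase starting error here are my choices, not part of the algorithm):

/* Spread `items` across `queues` queues: every queue gets the base
   amount, and the extras are handed out like y-steps in the line
   algorithm. Seeding error at dx/2 phases the extras so that 8 items
   in 5 queues comes out as {2, 1, 2, 1, 2}. */
void spread(int items, int queues, int *counts) {
    int base = items / queues;
    int dy = items % queues;  /* extras to hand out */
    int dx = queues;          /* slots to spread them across */
    int error = dx / 2;
    for (int i = 0; i < queues; i++) {
        counts[i] = base;
        error += dy;
        if (error >= dx) {    /* this queue gets one of the extras */
            error -= dx;
            counts[i]++;
        }
    }
}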

Maybe you want to do something occasionally, in a predictable and evenly spaced way. For instance, maybe you want to do something 3/10 of the time, but you don’t want to do exactly three in a row every ten operations, you want to spread them out, and you don’t want them to be random, or clustered, or sometimes do only 2/10 and sometimes do 4/10 (which is what might happen using an RNG to do it).

These are all equivalent, and this algorithm can produce nice smooth results for any of them. To select steps with probability $p$ (such that $0 \le p \le 1$), express $p$ as a rational number, $a/b$, and do this algorithm using $a$ as $dy$ and $b$ as $dx$.
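
For instance, here’s a sketch in C of a reusable “do this $a$ out of every $b$ steps” ticker (the names are illustrative):

/* Fires a times out of every b consecutive steps, evenly spaced.
   For the 3/10 example above, use rate_init(&r, 3, 10). */
struct rate { int a, b, error; };

void rate_init(struct rate *r, int a, int b) {
    r->a = a;
    r->b = b;
    r->error = 0;  /* seed this with another value to shift the phase */
}

int rate_step(struct rate *r) {
    r->error += r->a;
    if (r->error >= r->b) {  /* wrapped: this is one of the chosen steps */
        r->error -= r->b;
        return 1;
    }
    return 0;
}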

If you want to shuffle things a bit, and change which steps are which, you can cheat by using arbitrary values as your starting value for $error$.

You can also use this as a more general “occasionally but not all the time” generator. A single shared set of $dx$, $dy$, and $error$ values can be used to generate events at a given rate more cheaply than a PRNG can. It’s not cryptographically secure, but it’s also less vulnerable to the clusters of events that real random streams generate, and which you sometimes don’t want. In the degenerate case, where $dy$ is 1, this is just a counter which ticks over every $dx$ steps; what makes it interesting is that with higher $dy$, you can have it fire at arbitrary rational frequencies, not just frequencies that are the reciprocals of integers.

Constant-time lookup

Okay, but maybe you’ve spotted the problem here: while this is good for iterative cases, where you’re taking all the steps, it’s not good for answering the question “should item 1,732 in this list be one of the steps that gets an extra operation?” Well, good news: you can do that computation directly. The question of whether the accumulated error sum reached or exceeded $dx$ at this step is identical to the question of whether the accumulated error sum modulo $dx$ is less than $dy$.

To see why this is the case: it is an invariant of the algorithm that $0 \le error < dx$ at the end of every step. On a step where the sum reaches or exceeds $dx$, it must have started the step below $dx$; since $error - dx < 0$, it follows that $error - dx + dy < dy$. On steps where we don’t wrap, though, we know that our previous error was non-negative, and thus $error + dy \ge dy$. (There’s an exception to this for very early steps if $dx > 2dy$ when using the fancy starting error computation, since the error starts out negative. For the constant-time lookup case, you can use $2dy + dx$ as a starting value, to avoid any unpleasantness from wondering how a given language handles modulo on negative values.)

So for instance, given a starting error of $error$, on step 1,732, we check whether $(error + 1732 \cdot dy) \bmod dx < dy$. If it is less, then that step is one of the ones where $y$ changed.
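
As a sketch in C (the function name is mine; error0 is whatever starting error the iterative version would have used, assumed non-negative):

/* Did y change on step n? Equivalent to running the loop n times
   and asking whether the nth step wrapped. */
int stepped_at(long long n, int dy, int dx, int error0) {
    return (error0 + n * dy) % dx < dy;  /* n is wide to dodge overflow */
}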