
Monday, September 29, 2008

A simple probabilistic decision

In all of the examples of autonomous computer decision making presented up to this point, we've used equal probabilities for all the possible choices, using the random object or its audio counterpart noise~. The resultant choices have thus been truly arbitrary, within the limits prescribed by the program.

It's also possible to use random number generation to enact decisions that are still arbitrary but are somewhat more predictable than plain randomness. One can apply a probability factor to make one choice more likely than another in a binary decision, or (as will be demonstrated in a future lesson) one can apply a more complicated probability function to a set of possibilities, giving each one a different likelihood. This lesson will demonstrate the first case, using a probability factor to make a binary decision in which one result is more likely to occur than the other.

As described briefly in an earlier lesson on randomness, the probability of a particular looked-for event occurring can be defined as a number between 0 and 1 inclusive, with that number being the ratio of the number of looked-for outcomes divided by the number of all possible outcomes. For example, the probability of choosing the ace of spades (1 unique looked-for result) out of all possible cards in a deck (52 of them) is 1/52, which is 0.019231. The probability of not choosing the ace of spades is 1 minus that, which is 0.980769 (51/52). Thus we can say that the likelihood of choosing the ace of spades at random from a deck of cards is less than 2%, and the likelihood of not choosing it is a bit more than 98%; other ways of stating this are to say that there is a 1 in 52 chance of getting the ace of spades, or to say that the odds against choosing the ace of spades are 51 to 1.

For making an automated decision between two things, we can emulate this sort of odds by applying a probability factor between 0 and 1 to one of the choices. For example, if we want to make a decision between A and B, with both being equally likely, we would set the probability of A to 0.5 (and thus the probability of B would implicitly be 1 minus 0.5, which also equals 0.5). If we want to make A somewhat more likely than B, we could set the probability of A to something greater than 0.5, such as 0.75. This would mean there is a 75% chance of choosing A, and a 25% chance of choosing B; the odds in favor of choosing A are 3 to 1 (75:25). This does not ensure that A will always be chosen exactly thrice as many times as B. It does mean, however, that as the number of choices increases, statistically the percentage of A choices will tend toward being three times that of B.

The simplest way to do this in a computer program is as follows: Set the probability P for choice A. Choose a random number x between 0 and 1 (more specifically, 0 to just less than 1). If x is less than P, then choose A; otherwise, choose B. The result of such a program will be that over the course of numerous choices, the distribution of A choices over the total number of choices will tend toward P. The choices will still be arbitrary, and we can't predict any individual choice with certainty (unless the probability of A is either 0 or 1), but we can characterize the statistical probability of choosing A or B.
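That decision rule can be sketched in a few lines of Python (a hypothetical illustration; the actual program described here is, of course, built from Max objects):

```python
import random

def choose(p):
    """Return 'A' with probability p, otherwise 'B'."""
    x = random.random()  # uniform random float: 0 up to (but not including) 1
    return 'A' if x < p else 'B'

# Over many repetitions, the proportion of 'A' results tends toward p.
trials = 100_000
proportion_a = sum(choose(0.75) == 'A' for _ in range(trials)) / trials
```

With the probability set to 0.75, proportion_a will reliably land near 0.75 over that many trials, even though any individual choice remains unpredictable.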

Because the random generator in Max produces whole numbers less than the specified maximum rather than fractional numbers between 0 and 1, we have to do one additional step: we either have to map the range of random numbers into the fractional 0-to-1 range, or we have to map the probability factor into the range of whole numbers. It turns out to be slightly more efficient to do the latter, because it requires doing just one multiplication when we specify the probability, rather than doing a division every time we generate a random number. The following tiny program does just that. Every time it receives a message in its inlet, it will make a probabilistic choice between two possible results, based on a provided probability factor.
The probability value that goes into the number box (either by entering the number directly or by it coming from the right inlet or the patcherargs object) gets multiplied by 1,000,000 and the result is stored in the right inlet of the less than object. The bang from the button (triggered either by a mouse click or by a message coming in the left inlet) causes random to generate a random number from 0 to 999,999. If that random number is less than the number that came from the probability factor (which would be 450,000 in the above example), it sends out a 1; otherwise it sends out a 0. You can see that statistically, over the course of many repetitions, it will tend to send out 1 about 45% of the time.
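The integer version of the same rule might look like this in Python (a hypothetical stand-in for the patch, with randrange playing the role of the random object):

```python
import random

RESOLUTION = 1_000_000  # the multiplier used in the patch

def gamble(probability):
    """Scale the probability to an integer threshold once, then
    compare each random integer against that threshold."""
    threshold = int(probability * RESOLUTION)  # done once, when p is set
    x = random.randrange(RESOLUTION)           # 0 to 999,999, like the random object
    return 1 if x < threshold else 0
```

Note that the multiplication happens only when the probability is set, while the comparison happens on every choice; that is the small efficiency gain mentioned above.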

A useful programming trick that is invisible in this picture is that the number box has been set to have a minimum value of 0 and a maximum value of 1. Any value that comes in its inlet will be clipped to that range before being sent out, so the number box actually serves to limit the range and prevent any inappropriate probability values that might cause the program to malfunction. Protecting against unexpected or unwanted values, either from a user or from another part of the program, is good safe programming practice.

As was mentioned earlier, the multiplication of the probability by 1,000,000 is because we want to express the probability as a fraction from 0 to 1, but random only generates random whole numbers, so we need to reconcile those two ranges. We chose the number 1,000,000 because that means that we can express the probability to as many as six decimal places, and when we multiply it by 1,000,000 the result will always be a unique integer representing one of those 1,000,001 possible values (from 0.000000 to 1.000000 inclusive). Since six decimal places is the greatest precision that can be entered into a standard Max object, this program takes full advantage of the precision available in Max. It's inconceivable that you could create a situation in which you could ever hear the difference between a 0.749999 probability and a 0.750000 probability (indeed, in most musical situations it's doubtful you could even hear the difference between a 0.74 probability and a 0.75 probability), but there's no reason not to take advantage of that available precision.

Notice that this little program has been written in such a way that it can be tried out with the user interface objects number box, button, and toggle, but it was actually designed to be useful as a subpatch in some other patch, thanks to the inclusion of the inlet, outlet, and patcherargs objects. All of its input and output is actually expected to come from and go to someplace else in the parent patch, for use in a larger program. The number box clips any probability value to the 0-to-1 range, and the button converts any message in the left inlet to a bang for random. Save this patch with the name gamble, because it is used in the next example patch.

Now that we have a program that reliably makes a probabilistic decision, we'll use it to make a simple binary decision whether to play a note or not.

If the metro 100 object were connected directly to the counter 11 object, the patch would repeatedly count from 0 to 11, cycling through a list of twelve pitch classes stored in the table and playing a loop at the rate of ten notes per second. However, as it is, the metro 100 object triggers the gamble object to make a probabilistic decision, 1 or 0. The select 1 object triggers a note only if the choice made by gamble is 1. If you click on the toggle to turn on the metro 100 object, you will initially hear nothing, because the probability of gamble choosing a 1 is set to 0. If you change the probability value to 1., gamble will always choose 1, and you will hear all the notes being played.
If the probability is set to some value in between 0 and 1, say 0.8, gamble will, on average, choose to play a note 80% of the time and choose to rest 20% of the time. The exact rhythm is unpredictable, but the average density of notes per second will be 8.

The upper right part of the patch randomly chooses a new probability somewhere in the range from 0 to 1 every ten seconds, and uses linear interpolation to arrive at the newly chosen probability value in five seconds. The probability will then stay at that new value for five seconds before the next value is chosen. By turning on this control portion of the patch, you can hear the effect of different statistical note densities in time, and the gradual transition from one density to another over five seconds.

Note that when gamble chooses 0, no note is played but the select 1 object still sends the 0 to the multislider (but not to the counter) so that the musical rest is shown in the graphic display, creating the proper depiction of the note density.

In this lesson we made a useful subpatch for making probabilistic decisions, and we used those decisions to choose whether to play a note or not. Of course the decision could be between any two things. For example you might use it to make control decisions, at a slower rate, choosing whether to run one part of the program or another, to control a longer-term formal structure.

Sunday, September 28, 2008

Moving range of random choices

The previous example used a very slow metronome to cause a change in the music every five seconds -- specifically to choose a new range of random pitch numbers for the note-playing algorithm. Because the change happens suddenly at a specific moment in time, the change in the musical texture is abrupt, creating a distinct new block of notes every five seconds.

Just as we used linear interpolation in earlier chapters to play a scale from one pitch to another or to create a fade-in of loudness or brightness, we could also use linear interpolation to cause more gradual change in the range of random numbers being used in a decision making process. Instead of changing the range and offset of the random numbers abruptly, we could just as well interpolate from the current range settings to the new range settings over a period of time.

This patch is in some ways very similar to the previous one, but the big difference here is the introduction of the line object to make a gradual transition to a new range of random numbers over time.
Instead of the new ranges going directly to the right inlet of the + object and the random object, they go to line objects, which send out a changing series of numbers that move to the new value over the course of five seconds. The pack objects are there to make well-formed messages for the line objects, telling line what transition time to use (5000 ms) and how often to send out an intermediate number (every 125 ms).

As in the previous example, the random 88 object chooses the offset from the bottom of the piano keyboard, which will have 21 added to it and be used as the minimum value for the random range. That same number also is subtracted from 88 to set the maximum possible size of the randomly chosen pitch range, so as not to exceed the total range of the piano; the range size that actually gets chosen is thus some random number up to that maximum.
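That range-choosing logic can be paraphrased in Python (a hypothetical sketch of what the patch computes, not the patch itself):

```python
import random

def choose_pitch_range():
    """Pick a random offset and range size for pitches, staying on the piano.
    Mirrors the description above: random 88 chooses the offset, 21 is added
    to reach the bottom of the piano (MIDI 21 = A0), and the maximum range
    size is limited so the top never exceeds MIDI 108 (C8)."""
    offset = random.randrange(88)      # 0 to 87
    minimum = offset + 21              # lowest possible pitch
    max_size = 88 - offset             # keeps the top within the piano
    size = random.randrange(max_size)  # actual range size, up to that maximum
    return minimum, size
```

However large the randomly chosen offset, minimum + size can never exceed 21 + 87 = 108, so every possible note stays on the piano keyboard.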

This patch also differs from the previous one in that it varies the MIDI velocities of the notes as well as the pitches, to create some dynamic variation of loudness. To do this, the program chooses one of only six possible ranges (with the random 6 object), which one might think of as being analogous to the musical dynamic markings pp, p, mp, mf, f, and ff. Those random numbers 0 to 5 are multiplied by 20 and added to 8, so that the possible offsets for the velocity range are 8, 28, 48, 68, 88, and 108. The size of the velocity range is always 20 (as determined by the random 20 object) so there will always be the same range of variety in the velocities, but the range itself will almost always be moving continuously up or down because of the changing offset.
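The velocity arithmetic described above, paraphrased in Python (again a hypothetical sketch; the patch does this with random and arithmetic objects):

```python
import random

def choose_velocity_offset():
    """One of six dynamic levels, as described above:
    random 6 gives 0-5; times 20, plus 8 -> offsets 8, 28, 48, 68, 88, 108."""
    return random.randrange(6) * 20 + 8

def choose_velocity(offset):
    """A velocity within the fixed-size range of 20 above the offset."""
    return offset + random.randrange(20)
```

The top of the highest range is 108 + 19 = 127, the maximum legal MIDI velocity, so every choice stays in range.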

The transition time of 5000 ms for all of the line objects performing the gradual range changes was chosen to be the same as the interval of the metro object that is triggering the changes. However, the transition time could in fact be chosen to be quicker, all the way down to 0 ms for an immediate change. The interval of change choices by the metro and the transition time for the line objects have been set equal in this example so that the changes are constant and completely gradual, but in fact the two things could be treated as independent (that is, changes could be sometimes sudden and other times gradual).

There are still many more ways in which the musical form and texture could be varied in this sort of simple random decision making algorithm. For example, the rate of the note-playing metro could be varied continuously in a manner similar to the way pitch and velocity are being varied here, which would cause acceleration and deceleration of the note rate. It would also be possible to randomly change the time interval of the choice-making metro in order to vary the frequency with which new choices are made. Also, currently all note durations are the same as the inter-onset interval between notes; one can thus say that the ratio of duration to IOI is 1. However, an algorithm could include random variation of that ratio describing note duration as a factor of the inter-onset interval, to create either staccato notes (silence between notes) if the factor is less than 1, or overlapping notes if the factor is greater than 1. We'll see use of this legato factor in a future example. There could also be a part of the algorithm for making random decisions about presses of the piano's sustain pedal (MIDI controller number 64). Obviously musical composition usually involves much more than just decisions about pitch and loudness.

This example and the previous one have shown how changes in the range of possibilities can create interesting variation even with simple random decisions, and that the changes can be either abrupt or gradual for different effects. These principles of controlling the size and offset of crucial ranges are useful even when dealing with non-random decision making.

Friday, August 29, 2008

Fading by means of interpolation

To reinforce the importance of linear interpolation, let's look at a common technique used in music and video, the fade.

Fading a sound in from silence, or fading it out to silence, or fading an image in from darkness, or fading it out to darkness--these are commonly-occurring gradual transitions of the sort discussed in the previous lesson on linear change. To give the sense of gradual change between one value and another, which we abstractly called point A and point B in the previous lesson, we need the program to calculate intermediate values to create a linear transition. In the previous example, the transition was a simultaneous linear change in pitch (MIDI key number, from low to high) and loudness (MIDI velocity, from high to low) of musical notes. Now we'll do the same thing with loudness of a sound and brightness of an image.

This program shows an automated fade-in of sound and image by means of linear interpolation.

For this program to work properly, it needs to be able to access a particular (very small) image file. You should download the file tinyocean.jpg and place it in the same folder as the program file.

Whereas the previous examples on linear interpolation used the objects clocker and expr to calculate linear progressions of values, in this example we take advantage of the Max object line, which does essentially the same thing but takes care of some of the math and programming tasks for you. The line object outputs a linear progression of values that arrive at a specified destination value in a specified amount of time. It receives in its inlet a destination value (the value at which it should eventually arrive), a transition time (how long it should take to get to the destination value), and a time interval (how often it should send out an intermediate value as it progresses toward the destination). It takes care of the scheduling of output and stops when it reaches its destination. [N.B. The line object has a slightly quirky way of calculating timing, which you should be aware of. You can read about it in the "line_timing_tricks" subpatch of the line object's help file.]

Some of the earlier lessons have used MSP audio, but this is the first one that uses Jitter visual display. (MSP object names all end with the ~ character, and Jitter object names all begin with the jit. prefix.) MSP objects calculate a constant stream of audio samples, and Jitter objects store, process, and display multi-dimensional arrays of data (one of the most common of which is a two-dimensional array of color data to display a still image or a frame of video). For MSP audio to work, MSP must be turned on explicitly somewhere in the program; and for time-varying images to be displayed in Jitter, the data must be continually sent to a display window, usually triggered by a metro set to a fast tempo (less than or equal to the desired frame rate). This is explained in the very first chapter of the MSP and Jitter tutorials that come with the Max program. That's what the toggle switch in the upper-right corner of the program does; it starts MSP audio via the dac~ object and starts a metro that triggers repeated recalculation and display of the image. The metro and dac~ objects both need to be turned on for the program to work.

When the audio and video are turned on, we don't hear or see anything because the amplitude of the sound is turned to 0 (the default value of the line~ object is 0) and the brightness of the image is set to 0 (with the @brightness 0 attribute typed into the jit.brcosa object). The image is automatically loaded into a Jitter matrix (the jit.matrix object) when the program is loaded. So everything is initialized correctly automatically when the program is opened. Once the audio and video have been turned on, MSP is calculating audio, but all the samples end up being 0 (because the *~ object is multiplying every sample by 0); and Jitter is calculating the video display at a rate of 25 fps, but all the pixels end up being black (because the jit.brcosa object is multiplying every pixel by 0). To fade the sound and image in, then, we just need to create a gradual progression of values that increase upward from those initial 0 values.

When you click on the button, it triggers two messages to the line object that will change the brightness of the image. The first message, set 0., resets line internally to that value. The second message, which comes immediately after, 1. 5001 40, says "Go toward 1, arriving there in 5001 milliseconds, sending out an interpolated value between 0 and 1 every 40 milliseconds." Since 40 ms is the same periodicity as the metro, every "frame" of the video display will have a slightly greater calculated brightness over the course of 5 seconds, till it reaches a full brightness of 1. (Every pixel in the image gets multiplied by the brightness factor each time.) At practically the same time, two messages are sent to the line object that will change the loudness of the sound. The first message resets line internally to a value of -78, and the immediately ensuing message tells line to go toward -18 over the course of 5 seconds, sending out an interpolated value every 40 ms. These values are treated as decibels of amplitude, and are converted into actual linear amplitude values by the dbtoa object. The line~ object is similar to line, in that it requires a destination value and a transition time; however, it does not require a time interval for reporting, because it interpolates constantly for every single audio sample (i.e., at the audio sampling rate).
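The behavior of line can be sketched as a Python generator (a simplified model that ignores line's timing quirks; the names are illustrative, not Max objects):

```python
def ramp(start, destination, total_ms, grain_ms):
    """Yield the series of intermediate values a line-style object
    would send out: one value every grain_ms, arriving exactly at
    the destination when total_ms has elapsed."""
    steps = total_ms // grain_ms
    for i in range(1, steps + 1):
        t = (i * grain_ms) / total_ms            # fraction of the way there
        yield start + t * (destination - start)  # linear interpolation
```

For a brightness fade-in of 0 to 1 over 5000 ms at a 40 ms grain, this yields 125 intermediate values, ending exactly at 1.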

The three message boxes in the yellow panel are for "clean-up": for resetting things when the audio and video are turned off by the toggle at the top of the program. The select 0 object detects when the toggle has been turned off, and triggers a stop message to both of the line objects, just in case they are in progress, a 0 message to reset the amplitude and brightness values to 0, and a bang to trigger the image once more (with the brightness of 0, thus displaying a black screen). This clean-up makes sure everything truly is (and looks, and sounds) turned off.

Both in the case of brightness and the case of loudness (amplitude), multiplying by some number scales the value; multiplying by 0 reduces everything to 0, multiplying by some number greater than 0 but less than 1 reduces it to less than its normal amount, multiplying it by 1 leaves it alone, and multiplying it by a number greater than 1 will increase it.

There is an aspect of our perception of audio that complicates matters a bit, which is that we actually perceive loudness and pitch subjectively as the logarithm of relative amplitude and frequency. That is to say, our perception of sound intensity and musical pitch requires that the amplitude and frequency change in a geometric progression (i.e. multiplicatively) in order for us to perceive the change as an arithmetic progression (i.e. additively). We'll discuss this more in future chapters. For now, it's enough just to note that when we are concerned with the perceived loudness of a sound, it's often best to discuss its amplitude in terms of decibels (a logarithmic formula for characterizing relative amplitude). That's what the dbtoa object does for us; it allows us to make a linear progression stated in decibels and calculates an appropriate exponential increase in amplitude.

Just to illustrate, -78 decibels is equal to an amplitude of 0.000126, and -18 decibels is equal to an amplitude of 0.126. So in our fade-in of the noise in this example, the amplitude increases by a factor of 1000 (but by an additive difference of 0.125874). If we make a linear progression in decibels from -78 to -18, when we're halfway there we'll be at -48 dB, which is an amplitude of 0.004. Although that's much less than halfway to 0.126 arithmetically, 0.004 has the same ratio relationship to 0.000126 as 0.126 has to 0.004, so we're halfway there in terms of perceived loudness. If, on the other hand, we made a straight linear progression in amplitude from 0.000126 to 0.126, at the halfway point we would be at 0.0628, which is half the amplitude we are heading for, but is 500 times the amplitude at which we started! In other words, over the first half of the period of time, the amplitude would increase by a factor of 500, but for the second half of the period of time it would increase by a factor of only 2. We would hear a very substantial change in the first half, and very little change in the second half. The decibel scale corresponds much more closely to our subjective perception of a sound's intensity, and thus gives us a fade-in that feels more linear to us.
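The decibel-to-amplitude conversion is a one-line formula, sketched here in Python (a hypothetical equivalent of what the dbtoa object computes, taking 0 dB as an amplitude of 1):

```python
def dbtoa(db):
    """Convert decibels to linear amplitude: amplitude = 10^(dB/20)."""
    return 10 ** (db / 20)

# The figures from the text:
#   dbtoa(-78) ~ 0.000126, dbtoa(-48) ~ 0.004, dbtoa(-18) ~ 0.126
# Equal steps in dB give equal *ratios* of amplitude, which is why
# -48 dB sounds like the halfway point between -78 and -18.
```

The ratio dbtoa(-48)/dbtoa(-78) is the same as dbtoa(-18)/dbtoa(-48) (about 31.6), confirming the "halfway in perceived loudness" claim above.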

And finally, an aesthetic observation about the relationship of sound and image. The white noise (total randomness of audio samples) produced by the noise~ object is related to the noise made by the turbulence of ocean waves. It's not exactly the same, of course, but when the sound and the image are juxtaposed, and are further linked by the way they fade simultaneously, the visual image may influence us to feel that the noise is somehow ocean-like (even though it was not produced by an ocean) and the sound may serve to enhance the evocative appeal of the image. When two things are juxtaposed, our tendency is to think about their relationship, which sometimes includes finding relationships that may not in fact be there.

Tuesday, August 26, 2008

Linear change

"The shortest distance between two points is a straight line." That mathematical truism recalled from our geometry class is often quoted as a sort of proverb, reminding us that the most direct way to get somewhere is usually to head straight for it. The straight line stands for directness, and we can easily perceive its direction and predict where it will arrive if it continues. Thus, straight lines are useful for describing gradual-yet-direct change. If we draw a straight line from point A to point B, we're getting there by the most direct route, and we're also drawing every intermediate point that lies directly between A and B.

In the previous lesson, we provided the formula y=mx+b as a general way to describe any line. That's how it's given in algebra textbooks, but we'll need to rephrase it a bit for our purposes. For starters, how do we calculate the slope m? Well, for any two points--we'll call them A and B--we can calculate the slope of the line that runs through them (and thus the line segment that connects them) by dividing the vertical distance between them by the horizontal distance between them; that is to say, by dividing the difference in their y values by the difference in their x values. So if the coordinates of point A are (xa, ya) and the coordinates of point B are (xb, yb), then the slope m is equal to (yb-ya)/(xb-xa). To refer to the last example from the previous lesson, the first point on the line segment is (0,36) and the last point is (100,96), so the slope of the line is (96-36)/(100-0), which is 60/100, which is 0.6. Thus, by knowing the slope m (0.6) and the offset b (36), we can calculate what y value will lie on the line for any x value we put into the formula.
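That slope-and-offset calculation, written out as a small Python function (an illustrative sketch, using the example points from the text):

```python
def slope_and_offset(xa, ya, xb, yb):
    """Compute m and b for the line through points A and B (y = mx + b)."""
    m = (yb - ya) / (xb - xa)  # rise over run
    b = ya - m * xa            # offset: where the line crosses x = 0
    return m, b

# For the example in the text, A = (0, 36) and B = (100, 96):
m, b = slope_and_offset(0, 36, 100, 96)  # m = 0.6, b = 36
```

With m and b in hand, any x can be plugged into y = mx + b; for instance, x = 50 gives y = 66, the midpoint of the segment.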


In practical terms, for the purpose of programming linear change in sound, music, video, animation, etc., we'll need to know those values, or at least we'll need to be able to calculate them. (In the above example, we were able to calculate the slope because we knew the starting and ending x and y values.) Then, by starting at the desired value for x (a starting point in time) and proceeding to a desired destination value for x (a future point in time), we can calculate the values for y for as many intermediate x points as we want, to give the impression of a linear change in y. Before we look at an example, let's consider two other terms that are commonly used in digital media arts, which have direct relevance to this definition of a line.

-----

The term linear interpolation between two points A and B means finding an appropriate intermediate point (or points) that would exist if there were a straight line between A and B. To continue with the example we've been using, if we have the two points (0,36) and (100, 96), we can interpolate one or more points between them by calculating the y value at a hypothetical x value between 0 and 100. For example, just by using the formula and the known values for slope and offset, we can calculate that when x equals 20 y will equal 48, and when x equals 80 y will equal 84.
(0.6)20+36=48
(0.6)80+36=84
So, for any point between A and B, we can interpolate one or more additional points that will lie on a straight line segment between them. Another way to say this is that for any intermediate x value, we can find the appropriate corresponding y value.

Tangent: There's another way to think of linear interpolation, which is to think of "how far along" the intermediate x is, on a path from point A to point B. In other words, for the hypothetical x value, how far is it from its starting point xa, and how far is it from its destination point xb? We can actually calculate it as a fraction between 0 and 1 by calculating its distance from xa, which will be x-xa, and dividing that by the total distance between xa and xb, which will be xb-xa; so the fraction of the distance that a hypothetical x is on the path from xa to xb can be calculated with the expression (x-xa)/(xb-xa). So another way to think about calculating the y value for a hypothetical x value, is to multiply the destination y (yb) by that fraction because we're that fraction of the way there, and multiply the starting y (ya) by 1 minus that fraction. The equation for finding the y value that corresponds with a hypothetical x value is thus y = yb((x-xa)/(xb-xa)) + ya(1-(x-xa)/(xb-xa)). Or, to put it a bit more simply, y = ((x-xa)(yb-ya))/(xb-xa)+ya. That's a valid formula for linear interpolation, or linear mapping of x to y.
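The two ways of thinking about interpolation (slope-and-offset versus "how far along") can be checked against each other in Python (an illustrative sketch; both use the example points from the text):

```python
def interp_slope(x, xa, ya, xb, yb):
    """Interpolation via slope and offset: y = m(x - xa) + ya."""
    m = (yb - ya) / (xb - xa)
    return m * (x - xa) + ya

def interp_fraction(x, xa, ya, xb, yb):
    """Interpolation via the 'how far along' fraction:
    y = yb*f + ya*(1 - f), where f is the fraction of the path covered."""
    f = (x - xa) / (xb - xa)
    return yb * f + ya * (1 - f)
```

For the points (0,36) and (100,96), both functions give 48 at x = 20 and 84 at x = 80, matching the worked examples above.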

This process of linear interpolation is thus very closely related to another term, linear mapping, which means making a direct correlation of an x value to its corresponding y value. If we have a given range of x values (say, from xa to xb), and a corresponding range of y values (say, from ya to yb), then for any value of x we can calculate the linearly corresponding y value. This is called mapping x values to y values. (In theory, there could be a wide variety of curved or non-linear "maps" by which we make these correlations between x values and y values; however, for now we'll stick to linear mapping.) The formula for linear mapping, if you needed to program it yourself, is shown in the preceding paragraph. Mercifully, Max provides many objects that can calculate mapping for you (such as zmap, scale, etc.), but it's important to understand mapping conceptually since it is key to all sorts of programming in digital media.
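A general-purpose mapping function, in the spirit of those Max objects, might look like this in Python (a sketch; the comparison to zmap and scale is approximate, and the clip parameter is this sketch's own addition):

```python
def linear_map(x, xa, xb, ya, yb, clip=True):
    """Map x from the input range [xa, xb] to the output range [ya, yb].
    With clip=True the input is first constrained to its range (roughly
    zmap-like behavior); with clip=False, out-of-range inputs extrapolate."""
    if clip:
        x = max(min(x, max(xa, xb)), min(xa, xb))
    return ya + (x - xa) * (yb - ya) / (xb - xa)
```

So linear_map(20, 0, 100, 36, 96) gives 48, as in the interpolation example, while an out-of-range input such as 150 is either held at 96 (clipped) or extrapolated to 126.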

-----

Now let's look at a simple example of linear change. This program plays musical notes every 80 milliseconds (i.e., at the rate of 12.5 notes per second) for 2 seconds (2000 milliseconds). Over the course of those two seconds, the pitch of the notes changes linearly from MIDI 36 (low C) to MIDI 96 (high C) and the MIDI velocity (loudness) of the notes changes linearly from 124 (fortissimo) to 32 (piano). The result is a program that plays an ascending pentatonic scale, with a diminuendo.

Note that the program plays a major pentatonic scale, which is not a strictly linear configuration. (Some steps are 2 semitones, and some are 3 semitones.) Because we're stepping through five octaves of pitch in 25 increments, each step should be 1/5 of an octave. However, because MIDI does not allow for fractional values, the decimal part of the intermediary pitch values is truncated (chopped off), so ultimately not all the steps are exactly the same size. The fortuitous result of this truncation is the pattern that gives a major pentatonic scale. There's no magical mathematical relationship between truncation and this particular pitch pattern. It's just a fortunate (calculated) coincidence of the range, the number of notes being played within it, MIDI representation of pitch, and truncation of the fractional part of the number.

Let's look at a couple of details in the program. The clocker object reports at regular intervals the amount of time elapsed since it was turned on. This is similar to the timed counting demonstrated with the metro and counter objects, but in this case it is counting in increments of a certain number of milliseconds, corresponding exactly to the amount of time that has passed. This is handy because it allows us to check each report to see if a certain amount of time has passed--in this case 2000 milliseconds--and do something at that time (in this case, stop the process). Since we know the desired stopping time (that is to say, the destination x value) we can also use each reported time to calculate how far along we are to the destination time, and use that fraction for linear mapping of time to the pitch and velocity values.

We can divide the elapsed time (x) by the total time (2000) to get a fraction between 0 and 1. We then multiply that by the range of the desired y values (yb-ya), and add the desired offset (ya) to it. (We calculated the range of desired y values by subtracting the starting value from the ending value, i.e., yb-ya.) That's what's happening in the two expr objects. In the case of pitch, ya equals 36 and yb equals 96 so the range is 96-36, which is 60. In the case of velocity, ya equals 124 and yb equals 32 so the range is 32-124, which is -92. That's where the range values 60 and -92 come from in the expr objects. The numbers 36 and 124 in the expr objects are the ya (offset) values. The number boxes (because they are integer number boxes) drop the fractional part of the output from the exprs.
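The arithmetic in the two expr objects can be re-created in Python (a hypothetical paraphrase, not the patch itself); note how truncating the fractional part, as the integer number boxes do, produces the pentatonic pattern:

```python
# One calculation per clocker report, every 80 ms from 0 to 2000 ms.
pitches = []
velocities = []
for x in range(0, 2001, 80):
    pitches.append(int(x / 2000 * 60 + 36))       # pitch expr: range 60, offset 36
    velocities.append(int(x / 2000 * -92 + 124))  # velocity expr: range -92, offset 124

# pitches begins 36, 38, 40, 43, 45, 48, ... -- the C major pentatonic
# pattern, an artifact of truncation as described above.
```

Pitch runs from 36 up to 96 while velocity runs from 124 down to 32, matching the ascending scale with diminuendo described earlier.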

This direct use of the y=mx+b formula inside an expr object is just one way to do linear mapping in Max. The line object does automated linear interpolation, much like the combination of clocker and expr shown here, and objects such as zmap and scale calculate the mapping of an input range of x values onto an output range of y values if you provide the endpoints A and B.
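For reference, a zmap-style calculation written out in Python might look like this (a hypothetical sketch of the underlying math; note that the real zmap object also constrains the input to the specified range, which is omitted here):

```python
def zmap_like(x, xa, xb, ya, yb):
    # Map x from the input range xa..xb onto the output range ya..yb,
    # analogous to what Max's zmap and scale objects compute.
    return ya + (x - xa) * (yb - ya) / (xb - xa)

# A quarter of the way through 2000 ms, mapped onto pitches 36..96:
print(zmap_like(500, 0, 2000, 36, 96))   # 51.0
```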

Linear motion, linear change, linear interpolation, and linear mapping are frequently used in composition and digital media.

Saturday, August 23, 2008

That's Why They Call Them Digital Media

The foregoing examples demonstrate that musical notes, colors, locations--and indeed anything else--can be described numerically. That's the essence of computational creativity, and the essence of digital intermedia/multimedia. Many of these lessons will be about exploring that way of systematically--and numerically--describing aesthetics and composition.

So far we've used numbers to describe and compare amounts of time and rates of speed, to count and enumerate, to index orders of events, and to indicate musical pitch, loudness, intensity of color, and position in a two-dimensional coordinate system. Composition and aesthetic appreciation are about the creation and recognition of patterns--sonic, temporal, visual, and rhetorical. Thus, to the extent that we can program a computer to generate interesting numerical patterns, we are on the path to computer-generated art.

Mathematical functions with two variables (such as x and y) can be graphed in two dimensions as a curve. If we consider one of the variables, x, to be the linear progression of time, then the shape of the curve displays the change in the other variable, y, over time. Almost any function with two variables is potentially useful for algorithmic composition of time-based media art (music, video, animation, etc.), if we just consider one variable to be the passage of time and the other variable to describe some characteristic of the art.

Let's look at some examples. Remember x will stand for the passage of time, and y will stand for the value of something else. Time is expressed in some arbitrary units; you can think of the units as seconds for now. Likewise y is some arbitrary thing we're measuring; for the sake of having both a sonic and a visual example, let's think of y as standing for the volume (loudness) of a tone or the brightness of a color.

What does this formula tell us?
y = 10
Well, there's no x in that formula. That means that y is independent of x. No matter what x is, the value of y will always be 10. This describes a constant value, not a variable one. y stays constant over time. If this were loudness or brightness, we would perceive no change.

Now how about this formula?
y = x
This means that y increases as time progresses. Since time progresses linearly, y will increase linearly. If we call the start of the time under consideration "time 0", meaning x = 0, then y will be 0 at that time. At time 10, y will be 10, and so on. So over the course of time going from 0 to 10, y will increase linearly from 0 to 10. The units we're using are arbitrary, but imagine that over the course of 10 seconds the volume of a sound fades in from 0 to 10, and a color fades from total darkness to a brightness of 10. As soon as we get to real examples, we'll have to concern ourselves with what the units actually are and what they mean, but for now we're just talking abstractly.

Here's a formula that gives a simple curve.
y = sin(2πx)
This causes y to vary up and down in a smooth, sinusoidal fashion. Of course, all of these formulae are rather oversimplified; we'd need to provide more numerical information to make x and y take on the right values for the particular usage to which we're applying them. But these simple examples show that we can generate curves by relating any two variables, and that by increasing x linearly to stand for the passage of time we can use the resulting y to control some aspect of digital media.
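To make the contrast concrete, here's a quick Python sketch that evaluates all three formulas at a few points in time (units still arbitrary, as above):

```python
import math

def constant(x):  return 10                        # y = 10: unchanging over time
def linear(x):    return x                         # y = x: steady linear increase
def sinusoid(x):  return math.sin(2 * math.pi * x) # y = sin(2πx): one cycle per unit of x

for x in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print(x, constant(x), linear(x), round(sinusoid(x), 6))
# y = 10 never moves, y = x climbs steadily, and y = sin(2πx)
# rises to 1, falls through 0 to -1, and returns to 0.
```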
We can add other constant numbers or variables into the formula to make adjustments that bring the y value into a desired range. Those additional constants or variables will usually do one of two things: either scale the x or y value by multiplication (multiply it by some amount to scale it to a larger or smaller range) or offset the x or y value by addition (add some amount to it to push it into a different region).

For example, a more complete and useful formula for generating a smooth sinusoidal change would look like this.
y = Asin(2πƒt+ø)+d
where A is the amplitude of the sinusoid, ƒ is its frequency, t is time, ø is its phase offset, and d is its DC offset. The variable t in this case is like x in the previous examples. ƒ scales the sinusoid on the horizontal axis, and A scales it on the vertical axis; ø allows us to offset the sinusoid on the horizontal axis, and d will offset it on the vertical axis. We'll use some version of this formula often in synthesizing and processing sound in future lessons.

Here's a simpler equation that permits us to make any desired linear variation in y.
y = mx+b
where m is the slope (steepness of the angle) of the change in y, and b is the y intercept (the y value at which the line intersects the y axis), which we can call the (vertical) offset. Here's what that would look like with m = 0.6 and b = 36.
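In Python that line looks like this (a sketch; letting x run from 0 to 100 is just my choice for illustration, and conveniently takes y from 36 to 96, the pitch range used in the example above):

```python
def line(x, m=0.6, b=36):
    return m * x + b   # y = mx + b

print(line(0))     # 36.0  (the y intercept, b)
print(line(100))   # 96.0  (y has risen by m*100 = 60)
```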

Since we will be starting things at time 0, and will be mostly dealing with non-negative y values for MIDI, color, coordinates, etc., it sometimes makes sense just to think about the upper-right quadrant of the graph, showing y values as x (time) increases from 0.

In the next lesson we'll put the y=mx+b equation into practice in a program.