Algorithmic Composition: random

Monday, October 6, 2008

Probability distribution

In a set of possible events, each element of the set can and does have a distinct probability of occurring. We've seen how to write a program that ascribes probabilities to two possibilities. It's not much more difficult to make a probability vector -- an array of probabilities corresponding to each of the elements in a set of multiple possibilities. Once we have established this probability vector, we can use random numbers to read from that probability distribution, and over a sufficiently large sample of choices the events will occur with approximately the likelihoods that have been assigned to them.

This is fairly straightforward to implement as a computer program, and the process for choosing from a discrete probability distribution of multiple possibilities is essentially the same as choosing from a set of two possibilities. If we know the sum of the probabilities, we can in effect divide that range into multiple smaller ranges, the sizes of which correspond to the probability for each one of the possibilities. We can then choose a random number less than the sum, and check to see in which sub-range it falls. The process is something like this:

1. Construct a probability vector.

2. Calculate the sum of all probabilities.

3. Choose a random (nonnegative) number less than the sum.

4. Begin cumulatively adding individual probability values, checking after each addition to see if it has resulted in a value greater than the randomly chosen number.

5. When the randomly chosen value has been exceeded, choose the event that corresponds to the most recently added probability.

Here's an example. If we have six possible events {a, b, c, d, e, f} with corresponding probabilities {0., 0.15, 0., 0.25, 0.5, 0.1} and we choose a nonnegative random number less than their sum (the sum of those probabilities is 1.0) -- let's say it's 0.62 -- we then begin cumulatively adding up the probability values in the vector till we get a number greater than 0.62. Is 0. greater than 0.62? No. Is 0.+0.15=0.15 greater than 0.62? No. Is 0.15+0.=0.15 greater than 0.62? No. Is 0.15+0.25=0.4 greater than 0.62? No. Is 0.4+0.5=0.9 greater than 0.62? Yes. So we choose the event that corresponds to that last probability value: event e. It is clear that by this method events a and c can never be chosen. Random numbers less than 0.15 will result in b being chosen, random numbers less than 0.4 but not less than 0.15 will result in d being chosen, random numbers less than 0.9 but not less than 0.4 will result in e being chosen, and random numbers less than 1.0 but not less than 0.9 will result in f being chosen. In short, the likelihood of each event being chosen corresponds to the probability assigned to it.
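The five steps and the worked example above can be sketched in a few lines of Python. This is only an illustration of the procedure as described, not the code of any particular program; the function name choose_index and its optional r and rng parameters are assumptions made for the sketch (passing r lets us replay the 0.62 example exactly).

```python
import random

def choose_index(weights, r=None, rng=random):
    """Pick an index from a probability vector of nonnegative weights."""
    total = sum(weights)                      # step 2: sum of all probabilities
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    if r is None:
        r = rng.random() * total              # step 3: 0 <= r < total
    running = 0.0
    last_possible = None
    for index, weight in enumerate(weights):
        running += weight                     # step 4: cumulative sum
        if weight > 0:
            last_possible = index
        if running > r:                       # step 5: r has been exceeded
            return index
    return last_possible                      # guard against floating-point round-off

events = ["a", "b", "c", "d", "e", "f"]
probabilities = [0.0, 0.15, 0.0, 0.25, 0.5, 0.1]
print(events[choose_index(probabilities, r=0.62)])  # prints "e"
```

Calling choose_index(probabilities) with no r makes a fresh random choice each time. Events a and c, whose weights are 0, can never be returned.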

Max has an object designed for entering a probability vector and using it to make this sort of probabilistic decision. Interestingly, it is the same object we've been using for storing other sorts of arrays: the table object. When the table object receives a bang in its left inlet, it treats its stored values as a probability vector (instead of as a lookup array), uses that vector to make a probabilistic choice, and sends out the index (not the value itself) that corresponds to the choice, as determined by the process described above.

Note that this is fundamentally different from the use of table described in an earlier lesson, to look up values in an array. It's also fundamentally different from randomly choosing one of the values in an array by choosing a random index number. In this case, we're using the index numbers in the table (i.e., the numbers on the x axis) to denote different possible events, and the values stored in the table (i.e. the numbers on the y axis) are the relative probabilities of each event being chosen. A bang message received by the table object tells it to enact this behavior.

Note also that the probability values in the table don't need to add up to 1.0. In fact, that would be completely impractical since table can only hold integer values, not fractional ones. The probabilities can be described according to any desired scale of (nonnegative) whole numbers, and can add up to anything. The table object just uses their sum (as described in step 2 of the process above) to limit its choice of random numbers.

This program demonstrates the use of probability distributions to choose from among six possible pitches and six possible colors, with different likelihoods.

The table labeled "probabilities" stores a probability distribution. Its contents can be set to one of seven predetermined distributions stored in the message boxes labeled "probabilities", or you can draw some other probability distribution in the table's graphic editing window. (The predetermined probabilities have all been chosen so that they add up to 100, so that the values can be thought of as percentages, but they are really only meaningful relative to each other, and don't have to add up to 100.) The metro object sends bang messages to the table at a rate of 12.5 per second (once every 80 milliseconds) to make a probabilistic choice. The table object responds by sending out an index number from 0 to 5 each time based on the stored probabilities.

Those numbers are in turn treated as indices to look up the desired color and pitch events. The colors are stored in a coll object and the pitch classes are stored in another table object. This illustrates two different uses of table objects; one is used as a probability vector, and the other is used as a lookup array. The pitch choices are just stored as pitch classes 2 6 9 1 4 7 (D F# A C# E G), and those are added to the constant number 60 to transpose them into the middle octave of the piano. The color choices are stored as RGB values representing Red Magenta Blue Cyan Green Yellow, and those are drawn as vertical colored lines moving progressively from left to right. In this way one sees the distribution of probabilistic decisions as a field of colored lines, and one hears it as a sort of harmonic sonority.
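For readers who think more easily in text code than in patch cords, here is a rough Python sketch of the same two-stage idea: whole-number weights choose an index, and the index is then used to look up a pitch and a color. It imitates the behavior described here and makes no claim about how the table object works internally. The function names, the example weights, and the specific RGB triples are assumptions; the text only names the colors and the pitch classes.

```python
import random

PITCH_CLASSES = [2, 6, 9, 1, 4, 7]          # D F# A C# E G, as listed in the text
COLORS = [                                   # assumed RGB values for the named colors
    (255, 0, 0),    # red
    (255, 0, 255),  # magenta
    (0, 0, 255),    # blue
    (0, 255, 255),  # cyan
    (0, 255, 0),    # green
    (255, 255, 0),  # yellow
]

def bang(weights, rng=random):
    """Return an index chosen according to whole-number relative weights."""
    total = sum(weights)                     # the weights need not add up to 100
    r = rng.randrange(total)                 # whole number, 0 <= r < total
    running = 0
    for index, weight in enumerate(weights):
        running += weight
        if running > r:
            return index

def choose_event(weights, rng=random):
    """Use the chosen index to look up a MIDI pitch and an RGB color."""
    index = bang(weights, rng)
    return index, 60 + PITCH_CLASSES[index], COLORS[index]

example_weights = [40, 5, 30, 5, 15, 5]      # hypothetical distribution summing to 100
print(choose_event(example_weights))
```

The weights and the lookup lists are independent of each other: swapping in a different pitch list changes what is heard without changing how the choice is made.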

The metro object, in addition to triggering a probabilistic choice in the table object, triggers the counter object to send out a number progressing from 0 to 99 indicating the horizontal offset of the colored line. That number is packed together with the color information from the coll, for use in a linesegment drawing instruction for the lcd.

Now that we've seen an explanation of discrete probability distribution, and seen how it can be implemented in a program, and seen a very simple example of how it can be applied, let's make some crucial observations about this method of decision making.

1) This technique allows us to describe a statistical distribution that characterizes a body of choices, but each individual choice is still arbitrary within those constraints.

2) The choices are not only arbitrarily made, they produce abstract events (index numbers) that could potentially refer to anything. The actual pitch and color event possibilities were chosen carefully by the programmer to create specific sets of distinct possibilities, and the probability distributions were designed to highlight certain relationships inherent in those sets. Theoretically, though, the method of selection and the content are independent; choices are made to fulfill a statistical imperative, potentially with no regard to the eventual content of the events that those numbers will trigger.

3) Each individual choice is made ignorant of what has come before, thus there is no control over the transition from one choice to the next, thus there is no controlled sense of melody or contour in the pitch choices (other than the constraints imposed by the limited number of possibilities), nor pattern to the juxtaposition of colors. This limitation can be addressed by using a matrix of transition probabilities, known as a Markov chain, which will be demonstrated in another lesson.

4) The transitions from one probability distribution to another are all sudden rather than nuanced or gradual. This can be addressed by interpolating between distributions, which will also be demonstrated in another lesson.

5) Decision making in this example, as in most of the previous examples, is applied to only one parameter -- color in the visual domain and pitch class in the musical domain. Obviously a more interesting aesthetic result can be achieved by varying a greater number of parameters, either systematically or probabilistically. Synchronous decision making applied to many parameters at once can lead to interesting musical and visual results. This, too, is a topic for a future lesson.

Monday, September 29, 2008

A simple probabilistic decision

In all of the examples of autonomous computer decision making presented up to this point, we've used equal probabilities for all the possible choices, using the random object or its audio counterpart noise~. The resultant choices have thus been truly arbitrary, within the limits prescribed by the program.

It's also possible to use random number generation to enact decisions that are still arbitrary but are somewhat more predictable than plain randomness. One can apply a probability factor to make one choice more likely than another in a binary decision, or (as will be demonstrated in a future lesson) one can apply a more complicated probability function to a set of possibilities, giving each one a different likelihood. This lesson will demonstrate the first case, using a probability factor to make a binary decision in which one result is more likely to occur than the other.

As described briefly in an earlier lesson on randomness, the probability of a particular looked-for event occurring can be defined as a number between 0 and 1 inclusive, with that number being the ratio of the number of looked-for outcomes divided by the number of all possible outcomes. For example, the probability of choosing the ace of spades (1 unique looked-for result) out of all possible cards in a deck (52 of them) is 1/52, which is 0.019231. The probability of not choosing the ace of spades is 1 minus that, which is 0.980769 (51/52). Thus we can say that the likelihood of choosing the ace of spades at random from a deck of cards is less than 2%, and the likelihood of not choosing it is a bit more than 98%; other ways of stating this are to say that there is a 1 in 52 chance of getting the ace of spades, or to say that the odds against choosing the ace of spades are 51 to 1.

For making an automated decision between two things, we can emulate this sort of odds by applying a probability factor between 0 and 1 to one of the choices. For example, if we want to make a decision between A and B, with both being equally likely, we would set the probability of A to 0.5 (and thus the probability of B would implicitly be 1 minus 0.5, which also equals 0.5). If we want to make A somewhat more likely than B, we could set the probability of A to something greater than 0.5, such as 0.75. This would mean there is a 75% chance of choosing A, and a 25% chance of choosing B; the odds in favor of choosing A are 3 to 1 (75:25). This does not ensure that A will always be chosen exactly thrice as many times as B. It does mean, however, that as the number of choices increases, statistically the percentage of A choices will tend toward being three times that of B.

The simplest way to do this in a computer program is as follows: Set the probability P for choice A. Choose a random number x between 0 and 1 (more specifically, 0 to just less than 1). If x is less than P, then choose A; otherwise, choose B. The result of such a program will be that over the course of numerous choices, the distribution of A choices over the total number of choices will tend toward P. The choices will still be arbitrary, and we can't predict any individual choice with certainty (unless the probability of A is either 0 or 1), but we can characterize the statistical probability of choosing A or B.
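As a minimal sketch of that procedure in Python (the names gamble and tally, and the optional x and rng parameters, are assumptions for illustration), the decision itself is a single comparison:

```python
import random

def gamble(p, x=None):
    """Choose "A" with probability p, otherwise "B"."""
    if x is None:
        x = random.random()        # 0 <= x < 1
    return "A" if x < p else "B"

def tally(p, trials, rng=random):
    """Fraction of A choices over a number of trials."""
    a_count = sum(1 for _ in range(trials) if gamble(p, rng.random()) == "A")
    return a_count / trials

print(tally(0.75, 10000))          # tends toward 0.75, but varies from run to run
```

With p equal to 0 the comparison never succeeds, and with p equal to 1 it always does, which matches the two cases in which an individual choice is certain.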

Because the random generator in Max produces whole numbers less than the specified maximum rather than fractional numbers between 0 and 1, we have to do one additional step: we either have to map the range of random numbers into the fractional 0-to-1 range, or we have to map the probability factor into the range of whole numbers. It turns out to be slightly more efficient to do the latter, because it requires doing just one multiplication when we specify the probability, rather than doing a division every time we generate a random number. The following tiny program does just that. Every time it receives a message in its inlet, it will make a probabilistic choice between two possible results, based on a provided probability factor.

The probability value that goes into the number box (either by entering the number directly or by it coming from the right inlet or the patcherargs object) gets multiplied by 1,000,000 and the result is stored in the right inlet of the less than object. The bang from the button (triggered either by a mouse click or by a message coming in the left inlet) causes random to generate a random number from 0 to 999,999. If it is less than the number that came from the probability factor (which would be 450,000 in the above example), it sends out a 1, otherwise it sends out a 0. You can see that statistically, over the course of many repetitions, it will tend to send out 1 about 45% of the time.

A useful programming trick that is invisible in this picture is that the number box has been set to have a minimum value of 0 and a maximum value of 1. Any value that comes in its inlet will be clipped to that range before being sent out, so the number box actually serves to limit the range and prevent any inappropriate probability values that might cause the program to malfunction. Protecting against unexpected or unwanted values, either from a user or from another part of the program, is good safe programming practice.

As was mentioned earlier, the multiplication of the probability by 1,000,000 is because we want to express the probability as a fraction from 0 to 1, but random only generates random whole numbers, so we need to reconcile those two ranges. We chose the number 1,000,000 because that means that we can express the probability to as many as six decimal places, and when we multiply it by 1,000,000 the result will always be a unique integer representing one of those 1,000,001 possible values (from 0.000000 to 1.000000 inclusive). Since six decimal places is the greatest precision that can be entered into a standard Max object, this program takes full advantage of the precision available in Max. It's inconceivable that you could create a situation in which you could ever hear the difference between a 0.749999 probability and a 0.750000 probability (indeed, in most musical situations it's doubtful you could even hear the difference between a 0.74 probability and a 0.75 probability), but there's no reason not to take advantage of that available precision.
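The logic of the gamble patch, as described above, might be sketched in Python as follows. The names SCALE, threshold, and gamble are assumptions for the sketch, and so is the use of round(): the text does not say how the product is converted to a whole number, and rounding is chosen here only to sidestep floating-point artifacts.

```python
import random

SCALE = 1_000_000                      # six decimal places of precision

def threshold(probability):
    """Clip the probability to 0..1 and map it into whole numbers 0..1,000,000."""
    clipped = min(max(probability, 0.0), 1.0)   # the role played by the number box
    return round(clipped * SCALE)                # one multiplication per new probability

def gamble(probability, r=None):
    """Return 1 with the given probability, otherwise 0."""
    if r is None:
        r = random.randrange(SCALE)              # whole number from 0 to 999,999
    return 1 if r < threshold(probability) else 0

print(threshold(0.45))                 # prints 450000
```

In a real program the threshold would be computed once, when the probability changes, rather than on every decision; it is recomputed inside gamble here only to keep the sketch short.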

Notice that this little program has been written in such a way that it can be tried out with the user interface objects number box, button, and toggle, but it was actually designed to be useful as a subpatch in some other patch, thanks to the inclusion of the inlet, outlet, and patcherargs objects. All of its input and output is actually expected to come from and go to someplace else in the parent patch, for use in a larger program. The number box clips any probability value to the 0-to-1 range, and the button converts any message in the left inlet to a bang for random. Save this patch with the name gamble, because it is used in the next example patch.

Now that we have a program that reliably makes a probabilistic decision, we'll use it to make a simple binary decision whether to play a note or not.

If the metro 100 object were connected directly to the counter 11 object, the patch would repeatedly count from 0 to 11, to cycle through a list of twelve pitch classes stored in the table, to play a loop at the rate of ten notes per second. However, as it is, the metro 100 object triggers the gamble object to make a probabilistic decision, 1 or 0. The select 1 object triggers a note only if the choice made by gamble is 1. If you click on the toggle to turn on the metro 100 object you will initially hear nothing because the probability of gamble choosing a 1 is set to 0. If you change the probability value to 1., gamble will always choose 1, and you will hear all the notes being played.

If the probability is set to some value in between 0 and 1, say 0.8, gamble will, on average, choose to play a note 80% of the time and choose to rest 20% of the time. The exact rhythm is unpredictable, but the average density of notes per second will be 8.

The upper right part of the patch randomly chooses a new probability somewhere in the range from 0 to 1 every ten seconds, and uses linear interpolation to arrive at the newly chosen probability value in five seconds. The probability will then stay at that new value for five seconds before the next value is chosen. By turning on this control portion of the patch, you can hear the effect of different statistical note densities in time, and the gradual transition from one density to another over five seconds.

Note that when gamble chooses 0, no note is played but the select 1 object still sends the 0 to the multislider (but not to the counter) so that the musical rest is shown in the graphic display, creating the proper depiction of the note density.
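The behavior of this patch can be approximated in Python to check the claim about note density. In this sketch each metro tick either plays the next pitch class in the loop or records a rest, and the loop position advances only when a note is actually played. The function names and the stand-in pitch list are assumptions; the actual twelve pitch classes stored in the table are not given in the text.

```python
import random

def play_ticks(pitch_classes, probability, ticks, rng=random):
    """Return one entry per metro tick: a pitch class, or None for a rest."""
    position = 0
    result = []
    for _ in range(ticks):
        if rng.random() < probability:           # the gamble: play or rest
            result.append(pitch_classes[position % len(pitch_classes)])
            position += 1                        # the counter advances only on notes
        else:
            result.append(None)                  # a rest is still recorded for display
    return result

def notes_per_second(result, interval_ms=100):
    """Average density of played notes, given the tick interval."""
    played = sum(1 for entry in result if entry is not None)
    return played * 1000 / (len(result) * interval_ms)

loop = list(range(12))                           # hypothetical stand-in for the table
print(notes_per_second(play_ticks(loop, 0.8, 10000)))   # close to 8
```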

In this lesson we made a useful subpatch for making probabilistic decisions, and we used those decisions to choose whether to play a note or not. Of course the decision could be between any two things. For example you might use it to make control decisions, at a slower rate, choosing whether to run one part of the program or another, to control a longer-term formal structure.

Sunday, September 28, 2008

Moving range of random choices

The previous example used a very slow metronome to cause a change in the music every five seconds -- specifically to choose a new range of random pitch numbers for the note-playing algorithm. Because the change happens suddenly at a specific moment in time, the change in the musical texture is abrupt, creating a distinct new block of notes every five seconds.

Just as we used linear interpolation in earlier chapters to play a scale from one pitch to another or to create a fade-in of loudness or brightness, we could also use linear interpolation to cause more gradual change in the range of random numbers being used in a decision making process. Instead of changing the range and offset of the random numbers abruptly, we could just as well interpolate from the current range settings to the new range settings over a period of time.

This patch is in some ways very similar to the previous one, but the big difference here is the introduction of the line object to make a gradual transition to a new range of random numbers over time.

Instead of the new ranges going directly to the right inlet of the + object and the random object, they go to line objects, which send out a changing series of numbers that move to the new value over the course of five seconds. The pack objects are there to make well-formed messages for the line objects, telling line what transition time to use (5000 ms) and how often to send out an intermediate number (every 125 ms).
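The kind of output a line object produces for such a message can be approximated with a short Python generator. This is a sketch of linear interpolation under the settings named above (5000 ms transition, one value every 125 ms), not a description of the line object's exact output; the function name ramp and the example start and target values are assumptions.

```python
def ramp(start, target, duration_ms=5000, grain_ms=125):
    """Yield (time_ms, value) pairs moving linearly from start to target."""
    steps = max(1, round(duration_ms / grain_ms))   # 40 steps with these settings
    for step in range(1, steps + 1):
        time_ms = step * duration_ms / steps
        value = start + (target - start) * step / steps
        yield time_ms, value

for time_ms, value in ramp(30, 70):
    print(time_ms, value)
```

Setting duration_ms to 0 yields a single pair at the target value, which corresponds to an immediate change.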

As in the previous example, the random 88 object chooses the offset from the bottom of the piano keyboard, which will have 21 added to it and be used as the minimum value for the random range. That same number also is subtracted from 88 to set the maximum possible size of the randomly chosen pitch range, so as not to exceed the total range of the piano; the range size that actually gets chosen is thus some random number up to that maximum.

This patch also differs from the previous one in that it varies the MIDI velocities of the notes as well as the pitches, to create some dynamic variation of loudness. To do this, the program chooses one of only six possible ranges (with the random 6 object), which one might think of as being analogous to the musical dynamic markings pp, p, mp, mf, f, and ff. Those random numbers 0 to 5 are multiplied by 20 and added to 8, so that the possible offsets for the velocity range are 8, 28, 48, 68, 88, and 108. The size of the velocity range is always 20 (as determined by the random 20 object) so there will always be the same range of variety in the velocities, but the range itself will almost always be moving continuously up or down because of the changing offset.
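The velocity arithmetic is easy to verify with a few lines of Python. The function names here are assumptions; the numbers come straight from the description above.

```python
import random

def velocity_offset(level):
    """Map a dynamic level 0..5 (pp to ff) to the bottom of its velocity range."""
    return level * 20 + 8

def random_velocity(level, rng=random):
    """Choose a velocity within the 20-wide range that starts at the offset."""
    return velocity_offset(level) + rng.randrange(20)   # adds 0..19

print([velocity_offset(level) for level in range(6)])
# prints [8, 28, 48, 68, 88, 108]
```

The highest possible velocity is 108 + 19 = 127, which is exactly the top of the MIDI velocity range.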

The transition time of 5000 ms for all of the line objects performing the gradual range changes was chosen to be the same as the interval of the metro object that is triggering the changes. However, the transition time could in fact be chosen to be quicker, all the way down to 0 ms for an immediate change. The interval of change choices by the metro and the transition time for the line objects have been set equal in this example so that the changes are constant and completely gradual, but in fact the two things could be treated as independent (that is, changes could be sometimes sudden and other times gradual).

There are still many more ways in which the musical form and texture could be varied in this sort of simple random decision making algorithm. For example, the rate of the note-playing metro could be varied continuously in a manner similar to the way pitch and velocity are being varied here, which would cause acceleration and deceleration of the note rate. It would also be possible to randomly change the time interval of the choice-making metro in order to vary the frequency with which new choices are made. Also, currently all note durations are the same as the inter-onset interval between notes; one can thus say that the ratio of duration to IOI is 1. However, an algorithm could include random variation of that ratio describing note duration as a factor of the inter-onset interval, to create either staccato notes (silence between notes) if the factor is less than 1, or overlapping notes if the factor is greater than 1. We'll see use of this legato factor in a future example. There could also be a part of the algorithm for making random decisions about presses of the piano's sustain pedal (MIDI controller number 64). Obviously musical composition usually involves much more than just decisions about pitch and loudness.

This example and the previous one have shown how changes in the range of possibilities can create interesting variation even with simple random decisions, and that the changes can be either abrupt or gradual for different effects. These principles of controlling the size and offset of crucial ranges are useful even when dealing with non-random decision making.

Sunday, September 14, 2008

Limiting the range of random choices

The previous examples showed random selection from a body of possible choices. In the first example of randomness the selection was from among events that had a certain distinctive character (chords, paintings, drum sounds), such that the events were fairly engaging in their own right and the order in which they occurred was not terribly crucial. In the next example, showing the relationship of randomness and noise, the individual events were neutral and relatively characterless (individual samples in an audio signal, or pitches in a steady stream of notes), so the result was maximally patternless and colorless because all possibilities could occur with equal likelihood. In the case of random audio samples we get white noise (confirming the description as "colorless"), and in the case of random pitches we get complete atonality.

Even with such random decision making, however, we can still exert some control to shape the random selections in various ways. One way is simply to limit ourselves to a subset of the full range of possibilities. The subset will in some way be distinct from the full set, and that will allow us to characterize it by the way(s) in which it differs from the full set. For example, instead of choosing randomly from among all the keys of the piano (MIDI notes 21 to 108, as in the previous example), we could decide to choose from a smaller number of notes. If the subset we choose is significantly different from the full set, we'll recognize the difference. (In effect, we'll recognize the absence of certain possibilities.) For example, if we choose only from among the pitches 59 to 70, we could characterize those pitches as "middle range" because we would observe that the choices contained no very low or very high notes. Choices made in this way would be noticeably different from choices made in the range 21 to 32 (extremely low), or 97 to 108 (extremely high), or 95 to 96 (very high and extremely small range), or 21 to 66 (low but very wide range), and so on. In this way we can get variety and distinction between different subsets of possible choices, even though the decision making method within the specified range would still be completely random.

This example program demonstrates that idea. One can achieve variety of randomness by limiting the range of choices.

The note-playing portion of the program is at the bottom. The random object near the bottom of the program chooses from within a certain range, usually much smaller than the total possible range of 88 keys, so it has a limited number of possible notes to choose. The choices from within that range are then offset to a particular position on the keyboard by adding a certain amount to each note. (The idea of offsetting a range has been discussed in earlier chapters.) So, for example if the right argument of the random object is 6 (numbers from 0 to 5) and the right argument of the + object (the offset of the random range) is 84, then the possible pitches are 84 to 89 only. The note-playing metro object is set to a time interval of 62.5 milliseconds, which is the period corresponding to 16 notes per second. The pitch choices are also displayed in a multislider object set to display in Point Scroll style.

One useful way to think about a range of numbers is by its minimum and maximum (84 and 89 in the above example). Another useful way to think of it is by its size (calculated as the difference between the maximum and the minimum) and its offset (its minimum). The rangeslider object in the middle of this patch allows the user to specify a minimum and a maximum value, either by clicking and dragging with the mouse or by sending numbers in the inlets. This rangeslider has been set to have a built-in offset of 21 (specified as its Minimum Value in the Inspector); that means that 21 is the lowest value one can select with the mouse, and any number that comes in its inlets will get 21 added to it. The rangeslider's size is 88 (specified as its Number of Steps in the Inspector), so its total range corresponds to the pitches of a piano keyboard. Thus, the user -- or another part of the program -- can choose a range minimum and maximum between 21 and 108 in the rangeslider, and those numbers get sent out the outlets of rangeslider. The minimum is used as an offset value for the + object, and the argument for the random object is calculated by subtracting the minimum from the maximum (and adding 1 to it since random chooses numbers from 0 to one less than its argument).
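The conversion between the two ways of describing a range, and the way the result feeds a random choice, can be sketched in Python as follows. The function names are assumptions for illustration.

```python
import random

def range_to_size_and_offset(minimum, maximum):
    """Convert an inclusive minimum and maximum into a random argument and an offset."""
    size = maximum - minimum + 1     # add 1 because the choices run from 0 to size - 1
    return size, minimum

def random_pitch(minimum, maximum, rng=random):
    """Choose a pitch from minimum to maximum inclusive."""
    size, offset = range_to_size_and_offset(minimum, maximum)
    return offset + rng.randrange(size)

print(range_to_size_and_offset(84, 89))   # prints (6, 84)
```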

The upper part of the program uses a slower metro to automatically trigger new randomly-chosen minimum and maximum values for the rangeslider every 5 seconds. The random 88 object chooses one of the 88 possible keys to use as an offset from the bottom of the piano. (The rangeslider will add 21 to that to calculate its minimum value.) Once the offset has been chosen, that determines the limit to how large the range can be without extending beyond the top range of the piano, so the maximum is calculated by taking 88 minus the offset, choosing a random number less than that, and adding it back to the offset. (Once again, the rangeslider will add 21 to that to calculate its maximum value.)
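That calculation can be checked with a small Python sketch. The function name random_piano_range is an assumption, and the addition of 21 that the rangeslider performs is written out explicitly here.

```python
import random

def random_piano_range(rng=random):
    """Choose a random minimum and maximum that both fall on the piano (21 to 108)."""
    offset = rng.randrange(88)              # 0 to 87 keys above the lowest key
    extent = rng.randrange(88 - offset)     # a random number less than the room remaining
    minimum = 21 + offset                   # the rangeslider adds 21
    maximum = 21 + offset + extent
    return minimum, maximum

print(random_piano_range())
```

Because extent is always less than 88 minus the offset, the maximum can never exceed 21 + 87 = 108.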

This is a demonstration of two very different time intervals being used (5000 ms and 62.5 ms) to specify periodicity at two different formal levels of the music. The fast metro determines the note rate, and the slow metro determines the rate of change of the random range. Because of the mathematical relationship of the two rates, the range changes every 80 notes. You can think of the slower metronome as determining a control structure for the faster one; it provides information that controls parameters that limit the note choices in specific ways. We don't hear the results of the slow (control) metro directly, but we do hear the result of its actions by the limits that it sets for the note choices. Rhythmic interest in temporal formal structure often requires different levels of periodicity like this. This example has two levels of periodicity, but there's no reason there couldn't be more. The different levels can be synchronized or unrelated, but they will generally be separate parts of the program that communicate with each other. Using one part of the program as a control structure to influence another part is a very effective way to define different formal levels of decision making or change.

Saturday, September 6, 2008

Randomness and noise

There's a close relationship, both philosophical and mathematical, between randomness and noise.

Mathematically, randomness is characterized by a complete lack of discernible pattern, and equal probability of all possibilities. When we choose numbers at random at the audio sampling rate and listen to the resulting signal, it sounds to us like incomprehensible static: white noise. A spectral measurement of that sound would show essentially equal power at all frequencies. That is why it's called "white" noise; like white light, it contains all perceptible frequencies equally.
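To make the idea concrete, here is a small Python sketch that generates one second of random samples and measures how strongly each sample is related to the next one. A signal with equal power at all frequencies has essentially no correlation between a sample and its neighbors, so a value near 0 is what we would expect. The function names and the 44100 Hz sampling rate are assumptions for the sketch.

```python
import random

def white_noise(num_samples, rng=random):
    """Return random samples, each chosen independently between -1 and 1."""
    return [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]

def lag_correlation(samples, lag=1):
    """Normalized correlation between the signal and a delayed copy of itself."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    numerator = sum(a * b for a, b in zip(centered, centered[lag:]))
    denominator = sum(c * c for c in centered)
    return numerator / denominator

samples = white_noise(44100)              # one second at an assumed 44100 Hz
print(lag_correlation(samples, lag=1))    # close to 0: no discernible pattern
```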

Cognitively, we often find things incomprehensible when they seem to have no pattern, organization, or reason for being as they are. When sound seems to us to have no pattern or organization--that is, when we find it incomprehensible--we might find it uninteresting or even irritating. Sound that is deemed irritating or unwanted by someone is often termed noise. However, it might be that there is in fact a pattern or organization, but it is simply too complicated for a particular listener to understand, and thus would be considered noise by that person.

When something strikes you as incomprehensible noise, consider the possibility that it may actually have a very interesting underlying organization, but that that organization is simply unknown to you or is too complex for you to understand given your current knowledge. That can be an encouraging thought, because it means that with the proper education you can understand it and perhaps then appreciate it better.

Composer and philosopher John Cage, in his discourse "The Future of Music: Credo" in Silence: Lectures and Writings, says "Wherever we are, what we hear is mostly noise. When we ignore it, it disturbs us. When we listen to it, we find it fascinating." In fact, a pretty good working definition of noise, from a philosophical and cognitive standpoint, might be "unwanted sound", or more generally, to take it beyond the realm of only sound, "unwanted stuff". This implies that if one can get rid of expectation or desire of what sounds we "want" or what sound "should" be, then unwanted sound, irritating and annoying sound, will cease to exist.

There is another sense in which the word noise can be related to digital arts. In theoretical discourse about communication and information, the word noise is used in the sense of "extraneous content" introduced during transmission of a message (and thus usually unwanted). The noise in this sense is clearly defined as the difference between the information at its source (whence it is transmitted) and the information at its destination (where it is received). In the case of sound, we can take this word "difference" by its mathematical meaning; we can literally determine the noise by subtracting the source sound from the destination sound. This fact is used for noise reduction in sound transmission, such as in the use of balanced lines for analog audio, or in the use of digitized (PCM) audio for recording and telecommunications. It can also be used to evaluate the imprecision of a quantization process. For example, the "quantization noise" introduced by digitizing sound is the difference between the analog sound at the input and its digitized representation used to produce the output. That noise is generally white noise, caused by random imprecision in the measurement and digitization of every single sound sample.
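The idea of noise as a literal difference can be shown with a short Python sketch: a sine tone is quantized, and the quantization noise is obtained by subtracting the source from the result. The function name quantize, the 8-bit depth, the 440 Hz tone, and the 44100 Hz sampling rate are all assumptions chosen for illustration, and the quantizer is simplified (it does not clip at the top of the range as a real converter would).

```python
import math

def quantize(sample, bits):
    """Round a sample between -1 and 1 to the nearest step at the given bit depth."""
    levels = 2 ** (bits - 1)             # 128 steps per unit for 8 bits
    return round(sample * levels) / levels

# Source: 10 ms of a 440 Hz sine tone at an assumed 44100 Hz sampling rate.
source = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(441)]
destination = [quantize(s, 8) for s in source]

# The noise is the destination minus the source, sample by sample.
noise = [d - s for d, s in zip(destination, source)]
print(max(abs(n) for n in noise))        # never more than half a step (1/256)
```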

Sounds that lack pattern--either at the microscopic level of the sound wave itself or at the more macroscopic level of the organizations of different sounds--are characterized as noise. In that sense, all sounds can be situated on a continuum from totally simple (such as a sine tone) or predictable (a regularly repeating sound) to totally complex (to the point of incomprehensibility) and unpredictable (white noise, or random organizations of sounds). Sounds that are characterized as noisy in this way, such as drums and cymbals, which are situated on the noisy end of the continuum, are still musically useful and desirable, and sounds that are in the middle of the continuum, such as a breathy flute note which has both coherent sinusoidal components and random noise components due to air turbulence at the embouchure, can be very beautiful. So when used in this way, the characterization of a sound as "noise" is not a judgement of its desirability or lack thereof, but rather an evaluation of the amount of pattern and coherency it exhibits.

The pseudo-random numbers generated by a computer are actually derived by a systematic process. However, the process is so complicated and meaningless to us that we consider the results random. We can't follow the workings of the process in our heads, and even if we could, we probably wouldn't find that process and the patterns it generates to be meaningful or interesting. So it really is appropriate to think of a computer's pseudo-randomly generated numbers as random for our practical purposes.
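To make the "systematic process" concrete, here is a minimal sketch of one classic method, a linear congruential generator. It is an assumption chosen for illustration, not the method Max or any particular language actually uses; the class name and the seed value are hypothetical, and the multiplier and increment are a commonly published pair of constants.

```python
class LinearCongruential:
    """A tiny pseudo-random generator: each state is computed from the last."""

    def __init__(self, seed):
        self.state = seed % 2**32

    def next_int(self, limit):
        # The whole "process": multiply, add, and wrap around at 2**32.
        self.state = (1664525 * self.state + 1013904223) % 2**32
        # Use the higher bits, then reduce to the range 0..limit-1.
        return (self.state >> 16) % limit

a = LinearCongruential(seed=2008)
b = LinearCongruential(seed=2008)
seq_a = [a.next_int(12) for _ in range(20)]
seq_b = [b.next_int(12) for _ in range(20)]
```

Two generators started from the same seed produce exactly the same sequence, which shows that nothing here is random in the strict sense; the numbers only look patternless to us.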

In the previous examples, we used random decision making by the computer to choose from among a set of coherently related and moderately interesting possibilities: a set of twelve related musical chords, a set of six related images by the same painter, and a set of five sound files that all came from the same musical instrument. So the organization and aesthetic coherency that the composer/programmer put into the program is easily apparent when we run the program, but the ordering of the elements does not seem to have any sense of purpose or meaning because there genuinely is none; the ordering is random.

If the set of elements being chosen is stylistically neutral or unrelated, then we get an even less intentional effect because there is even less aesthetic control exerted by the programmer. Here is a program that shows some direct uses of random numbers to generate sound, music, and image, with almost no stylistic judgement or taste exerted by the programmer.

Of course, every decision, no matter how trivial or banal, is a judgement of some kind, and aimlessness or randomness or lack of personality could even be a desired trait in certain circumstances. In such cases, blatant exposition of randomness can be a useful means to that end. The point of these examples is to show randomness as a means of choosing a patternless series of values for sonic or visual characteristics that can only really be made meaningful by their organization. The result, as one might expect, is about as lacking in pattern as can be achieved from a steady stream of numbers.

In program No. 1, the program regularly chooses at random from among 88 MIDI pitch possibilities and 89 MIDI velocity (loudness) possibilities. To put the numbers coming from the random objects into a useful range, we add an offset to them. The notes are offset by 21 to put them in the range of a piano (21 to 108) and the velocities are offset by 32 to put them in a range compatible with most synthesizers that expresses soft to loud (32 to 120). There is no pattern to the sequences of pitches and loudnesses, and no relationship between them. Cognitively, we might "stream" the events into different levels or categories (such as the very loud notes, or the very low notes, etc.), based on gestalt psychology principles of perception such as proximity and good continuation, but any perception of pattern that we may have is coincidental rather than intentional.
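The offset arithmetic in program No. 1 can be sketched in Python as follows. This is a rough text equivalent rather than a translation of the Max patch; the function name `random_note` and the seeded generator are assumptions made so the example is repeatable.

```python
import random

def random_note(rng):
    """Pick an independent random pitch and velocity, then offset each one."""
    pitch = rng.randrange(88) + 21      # 0..87 becomes 21..108 (piano range)
    velocity = rng.randrange(89) + 32   # 0..88 becomes 32..120 (soft to loud)
    return pitch, velocity

rng = random.Random(1)                  # seeded only to make runs repeatable
notes = [random_note(rng) for _ in range(1000)]
```

Pitch and velocity come from separate calls, so there is no relationship between them, just as in the patch.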

Program No. 2 uses the MSP noise~ object to produce a constant stream of random numbers between -1 and 1 at the audio sampling rate (most commonly 44,100 samples per second). If you listen to that directly, it sounds like static. If you click on the message box with the number 2, to open up the second signal inlet of the selector~, you will hear a simple sinusoidal tone, but with randomly chosen frequency changing every 1/10 second. The snapshot~ 100 object samples the random noise stream once every 100 milliseconds, the abs 0. object forces the numbers to be non-negative, and then those numbers between 0 and 1 are scaled by 88.0 and offset by 21.0 to make a pitch selection in MIDI-like terms, which then gets converted by the mtof object into a frequency value for the cycle~ object. All of those objects handle the numbers as floating-point values, so that the fractional part of the number is preserved. This means that the cycle~ object can really play any of about a billion possible frequencies within the range of the piano, rather than just the 88 notes of the equal-tempered scale as in the MIDI example. If you wanted to limit it to the 88 notes of the equal-tempered scale, you could simply remove the decimal point from the + 21. object, which would cause that object to throw away the fractional portion of its result.
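The chain of objects after snapshot~ amounts to a small calculation, sketched here in Python. The `mtof` function below uses the standard equal-tempered conversion with MIDI note 69 at 440 Hz; the name `random_frequency`, the `equal_tempered` flag, and the use of `rng.uniform` as a stand-in for one sample of noise~ are assumptions for the sake of the example.

```python
import random

def mtof(midi_pitch):
    """Convert a (possibly fractional) MIDI pitch to frequency in Hz."""
    return 440.0 * 2.0 ** ((midi_pitch - 69.0) / 12.0)

def random_frequency(rng, equal_tempered=False):
    sample = rng.uniform(-1.0, 1.0)      # stands in for one noise~ sample
    pitch = abs(sample) * 88.0 + 21.0    # scale by 88, offset by 21
    if equal_tempered:
        pitch = int(pitch)               # discard the fractional part
    return mtof(pitch)

rng = random.Random(2)
frequencies = [random_frequency(rng) for _ in range(100)]
tempered = [random_frequency(rng, equal_tempered=True) for _ in range(100)]
```

Setting `equal_tempered=True` plays the role of removing the decimal point from the + 21. object: the fractional part is thrown away before the pitch is converted to a frequency.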

Program No. 3 uses random choices of color from the object's standardized palette of 256 colors, and chooses a random point (x and y coordinates chosen at random) to which to draw a straight line using that color. Since all of the ranges needed for this procedure start at 0 (0-255 for color, and 0-319 and 0-240 for x and y), there's no need to add an offset. The integers chosen by the random objects are immediately applicable for use in messages to the lcd object.

Notice that in all three of these examples, the range of the random numbers has been limited to keep the numbers within a range of "reasonable" choices. For example, the musical notes are kept within the range of a piano; exceptionally low or high notes, or exceptionally loud or soft ones are not permitted. For that matter, in the example that employs MIDI, the pitches are also limited within the equal-tempered twelve-tone scale. The random values of the noise~ object are automatically kept within the range from -1 to 1 expected by the DAC, which also happens to be an easily scalable range. To use those values for choosing frequencies, the range is scaled and offset to be in the range of the piano (although not limited to equal temperament as in the MIDI example). In the visual example, position coordinates are limited to stay within the size of the drawing area; lines do not go toward any offscreen locations. The colors are chosen from among a set palette of 256 possibilities. So just by choosing the ranges of the random numbers, the programmer has exerted a little "artistic" discretion, by deciding what constitutes a "reasonable" value in each situation.

The timing of events in each example -- as in all the examples given so far -- is constant, giving a mechanistic feeling to the programs. There is no reason, however, that the timing intervals, too, couldn't be subjected to random choices of values, introducing unpredictability into the time dimension as well.

These examples can be thought of as showing randomness and random decision making in a raw and relatively unadulterated form. In many future examples we'll continue to use randomness and noise to show how random numbers can be limited, controlled, weighted, and filtered to get more meaningful--yet still not fully predictable--results.

Sunday, August 31, 2008

Randomness

What does it mean to say that something is random? In general everyday usage it means a thing that occurs or is chosen without any particular bias, method, or conscious decision. In statistical usage it means equal likelihood of all possibilities. Both of those usages are applicable in the case of a card trick that begins, "Pick a card, any card." If the cards are presented in a neutral way, and the chooser is at liberty to choose any card, then all cards are equally likely to be chosen. And in most cases the chooser chooses at random, too, with no particular method or preference.

In all real world instances of randomness, we're not really talking about all possibilities; there is a limited range or field of possibilities. In a deck of cards, for instance, there are 52 possibilities (not counting jokers), each with a unique designation such as "three of hearts". So for each card there is a 1-in-52 chance of being the chosen card, and we know that it will have one of an expected set of designations. (There is no chance of, let's say, choosing a 57th card with the designation "seventeen of swords".) It is a limited number and type of possibility, but within those established limits randomness can occur.

True randomness is thus more of a concept than a reality. In computer programming, when we refer to random numbers, we actually mean pseudo-random numbers: numbers chosen from within a particular range of equally likely possibilities, by some system that is too complicated or obscure for us to comprehend, resulting in choices that appear to have no governing bias, method, or pattern. All programming languages contain a function for generating pseudo-random numbers--numbers within a particular range that appear to be completely unpredictable and to have no over-all pattern. (Mathematicians and computer scientists have devised many methods for generating pseudo-random numbers, but we won't concern ourselves here with the method of generation. We'll simply use the method provided for us by the programming language we happen to be using.)

Choosing a random number in a computer program (i.e., generating a number within a known set of possibilities by a pseudo-random process) is a way to simulate arbitrary decision making (a decision made without method or preference). It's also possible to program the computer to make arbitrary decisions using so-called weighted (i.e., unequal) probabilities, such that some numbers occur statistically more often than others (given a large enough statistical sample). We'll look at weighted randomness in another lesson. For now, we'll stick to random numbers of equal probability.

This program demonstrates some methods of random number generation in Max. It uses those random numbers to select sounds and images arbitrarily from limited sets of possibilities. For this program to work properly, you'll also need to download the very small audio clips and images that it uses. Just right-click (or control-click on Macintosh) on the following links, and save the files in the same directory as you save the program itself.
Images: gourmet.jpg, nubleu.jpg, brascroises.jpg, guitariste.jpg, tragedie.jpg, celestina.jpg
Sounds: bd.aif, tom.aif, snare.aif, hihat.aif, cymbal.aif

To begin discussing randomness in programming, let's stay with the "pick a card, any card" example for a moment. The key to the statistical definition of randomness, you'll recall, is that there is an equal likelihood of each possible outcome. In other words, there is an equal probability of each possible result. That leads us to the mathematical definition of probability: the number of looked-for results divided by the number of possible results. If we choose at random from 52 possibilities, there is a 1-in-52 chance of any particular looked-for result (such as the ace of spades, for example); that means the probability of choosing the ace of spades (1 looked-for result) out of all possible cards (52 of them) is 1/52, which is 0.019231. The probability of choosing any other particular card is the same. Note that by this definition of probability, the probability of any particular outcome (or set of looked-for results) can be expressed as a fraction from 0 to 1 inclusive, and the sum of the probabilities of all the possible results will equal 1.

If we put the chosen card back in the deck, and mix the cards up again, we'll once again have 52 cards, all of equal probability 0.019231. If on the other hand, we set the first chosen card aside instead of putting it back in the deck, and now choose from the remaining 51 cards, the first chosen card will now have a 0 probability of being chosen (it's no longer a possibility), and all the remaining cards will have a probability of 1/51, which is 0.019608. You can see that as we remove more cards from the deck, the probability of choosing any particular one of the remaining cards will increase, although all remaining cards will still have an equal probability. By the time we make our 51st choice, we'll be down to only two remaining cards, each with a probability of 0.5, and on the 52nd choice we'll have a 100% likelihood (a probability of 1) of choosing the one remaining card.
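The arithmetic above can be checked with a short Python sketch that uses exact fractions. The function and variable names are hypothetical; they simply restate the two cases of putting the card back or setting it aside.

```python
from fractions import Fraction

def probability_per_card(cards_remaining):
    """Each remaining card is equally likely, so each has probability 1/n."""
    return Fraction(1, cards_remaining)

# Putting the card back: every draw is from the full deck of 52.
with_replacement = [probability_per_card(52) for _ in range(3)]

# Setting each card aside: the deck shrinks by one card per draw.
without_replacement = [probability_per_card(52 - already_drawn)
                       for already_drawn in range(52)]

second_draw = float(without_replacement[1])    # 1/51, about 0.019608
fifty_first = without_replacement[50]          # two cards left: 1/2
fifty_second = without_replacement[51]         # one card left: 1
```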

The distinction in the preceding paragraph between putting the chosen card back in the deck or not is an illustration of the difference between the random object and the urn object in Max. The random object chooses from a specified number of possible integers, with each choice being independent of any and all previous choices. The urn object also chooses at random, but it never repeats a choice; once it has chosen a certain number, that number is taken out of the set of possible choices (until the object is reset to its initial state with a clear message). Thus, urn avoids repetitions, and once it has chosen each possibility once, it stops choosing numbers, and instead sends a bang out its right outlet to report that all possible choices have been made. The random object, by contrast, always "forgets" its previous choice, and chooses anew from all of the numbers within the specified range.
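For readers who think more easily in text code, the behavior of urn described above might be approximated in Python like this. The `Urn` class is a hypothetical stand-in, not Max's implementation; returning `None` is an assumption used here to represent the bang that urn sends out its right outlet.

```python
import random

class Urn:
    """Chooses at random without repeating until it is cleared."""

    def __init__(self, size, rng=None):
        self.size = size
        self.rng = rng or random.Random()
        self.clear()

    def clear(self):
        """Put every number back, like sending urn a clear message."""
        self.remaining = list(range(self.size))

    def bang(self):
        """Return an unused number, or None once all have been chosen."""
        if not self.remaining:
            return None
        index = self.rng.randrange(len(self.remaining))
        return self.remaining.pop(index)

urn = Urn(6, random.Random(0))
draws = [urn.bang() for _ in range(6)]   # each of 0..5 exactly once
exhausted = urn.bang()                   # nothing left to choose
```

By contrast, a plain call such as `random.randrange(6)` behaves like the random object: it keeps no memory of earlier choices, so repetitions are possible.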

In this example program, program No. 1 chooses randomly from a list of twelve stored chords every two seconds. The chords are composed in such a way that they have a clear root and all can be reasonably interpreted to have a function in C minor, yet they are sufficiently ambiguous and are voiced in such a way that any chord can reasonably succeed any other chord.

Because the ordering of the chords is chosen at random--truly arbitrarily--by the program, the harmonic progression sounds rather aimless. That's because it is, in fact. The program has no sense of harmonic function, no rules about one chord leading to or following another, etc., the way that a human improviser would. So this type of arbitrary decision making at this formal level can't really be likened to compositional or improvisational decision making that a thinking human would perform. It's useful for producing unpredictable ordering of a set of events, though, and its effectiveness varies at different formal levels and with different types of controls and applications, so we'll see different uses of random numbers in future lessons.

The random object needs to be supplied with a number that specifies how many possible numbers it can choose from. Since this random object has the argument 12, it will choose from among 12 different numbers, integers 0 to 11. The random object always chooses numbers from 0 up to one less than the specified limit. This results in the right number of possibilities, and the fact that the range starts at 0 makes it useful for accessing arrays, as we've seen in earlier examples. It also means that we can easily change the range by adding some number to the output of random. Because each random choice is independent of previous choices there is a possibility--indeed there is a 0.083333 probability--that it will choose the same chord twice in a row.

The chords are stored as twelve 5-note lists in coll, indexed with numbers from 0 to 11. When these lists come out of coll they get broken up into five individual messages by iter so that they can be sent out as five separate--but essentially simultaneous--MIDI notes.

Program No. 2 uses urn to choose a random ordering of six images. Like the twelve chords, the images have been selected because they are related--they're all images of human subjects from Picasso's blue period--but there is no aesthetic or logical reasoning behind the order in which they're presented. The urn object ensures that there are no repetitions, and after it has chosen each of the six possible numbers from 0 to 5, the next time it gets a bang from the metro it sends a bang out its right outlet to indicate that all possibilities have been expended. In this program, we use that notification to turn off the metro and to send a clear message back to urn to get it ready for the next time.

The program initially reads into memory each of the images we want to display, and assigns each of them a symbolic name. Those names are also stored in a umenu object so that they can be recalled with the numbers 0 through 5. When urn sends out a number, umenu uses the number to look up the name stored at that index in its internal array, and sends the name out its middle outlet. The prepend object puts the word drawpict before the name, so that a message such as drawpict guitariste will go to the lcd object and draw the picture.

Program No. 3 shows a slight variation on the sort of random selection used in No. 1. It plays very short sound files chosen at random from five possibilities, but never plays the same file twice in a row. Each time that a random number comes out, it goes to a select object to be compared to the previous number. If it is not the same as the previous number, it is passed out the right outlet of select and used to choose a sound. However, if it is the same as the previous number, select sends a bang back to random to try again. The select object is initialized with the argument -1 so that the first choice by random, which we know cannot be -1, will never be rejected. So after the very first choice, instead of there being five possibilities, there are actually only four, because we know that the preceding number will be rejected if it's chosen again immediately. It's still random, but with a rule imposed after the choice, a rule that rejects repetitions and keeps retrying until it gets a new number.
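The reject-and-retry rule can be sketched in Python as a simple loop. This is an approximation of the logic, not of the patch's wiring: the function name is hypothetical, and a `while` loop takes the place of the bang that select sends back to random. The final `+ 1` is the cue-number offset that sfplay~ requires, since its cues are numbered from 1.

```python
import random

def choose_without_immediate_repeat(limit, previous, rng):
    """Keep choosing until the result differs from the previous choice."""
    while True:
        choice = rng.randrange(limit)
        if choice != previous:
            return choice

rng = random.Random(3)
previous = -1                    # cannot match any first choice
cues = []
for _ in range(200):
    previous = choose_without_immediate_repeat(5, previous, rng)
    cues.append(previous + 1)    # shift 0..4 to cue numbers 1..5
```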

[N.B. This programming technique of using an object's output to potentially trigger immediate new input back to its left inlet is not generally advisable in Max, because it could cause a "stack overflow", a situation in which a chain of messages keeps triggering itself, with no delay in between, until Max can no longer keep track of the unfinished tasks. However, in this case, the probability that random would choose the same number enough times in a row to cause a stack overflow is minuscule.]

The program initially opens each of the different sound files we will want to play and stores a pointer to each of those files as a numbered "cue". Because of the way that the sfplay~ object is designed, the sound file most recently opened with an open message is considered cue number 1, and other numbered cues can be specified with the preload message. There is no cue number 0 in sfplay~; that number is reserved for stopping whatever cue is currently playing. Therefore, what we really need in order to access the five cues in sfplay~ is not numbers 0 through 4, but rather numbers 1 through 5. This is easy to achieve simply by adding 1 to every random number before passing it on to sfplay~ to be used as a cue number. (This addition is another example of an offset, as demonstrated in the earlier examples on linear mapping.)

Program No. 4 just demonstrates a handy trick for causing urn to keep outputting numbers once it has chosen all the possibilities. You can use the notification bang that comes out of urn's right outlet to trigger some new action, so in this example we use it to trigger a clear message back to urn itself, then retry with a new bang to urn. Note that this leaves open the possibility of a repetition between successive numbers, since the first new number that urn chooses after it is cleared could possibly be the same as the last number it chose before it was cleared. (The probability of an immediate repetition occurring in this way is inversely proportional to the number of possibilities.)
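One way to picture the self-clearing urn in Python is as a generator that reshuffles the full set each time it runs out. This is a sketch under the assumption that shuffling a list and reading it in order is equivalent to drawing without replacement; the name `endless_urn` is hypothetical.

```python
import itertools
import random

def endless_urn(size, rng):
    """Yield numbers forever, with no repeats within each pass of `size`."""
    while True:
        remaining = list(range(size))
        rng.shuffle(remaining)      # a fresh random ordering after each clear
        yield from remaining

stream = list(itertools.islice(endless_urn(4, random.Random(5)), 12))
passes = [stream[i:i + 4] for i in range(0, 12, 4)]
```

Each group of four in `passes` contains every number once, but nothing prevents the last number of one pass from matching the first number of the next, which is the boundary repetition noted above.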

Notice that the complete randomness produced by random leaves open the possibility of some improbable successions of events occurring. Unlikely short-term patterns, such as 2,2,2 or 1,2,1,2 are possible, especially when the total number of possibilities is relatively small. So random is useful for generating unpredictable results, but that includes the possibility of improbable distinctive successions. The urn object avoids such successions that involve repetition of a number, but it becomes more predictable as its number of possible choices decreases. (We know that it won't choose a number that it has already chosen.)

The artist-programmer should determine what sort of randomness, if any, meets the aesthetic goals of a particular situation. Future lessons will show other uses of random numbers for decision making.

Thursday, August 21, 2008

Intuition

There are some decisions that we make and we know how we made them; we can fully describe the process by which we arrived at a particular decision. There are other decisions that we make (and feel confident it's the necessary or right decision) without being able to describe systematically the process by which we arrived at that decision. And there are decisions that we make arbitrarily, either because all possible choices seem equally valid, or because the decision is too trivial to spend time thinking about, or perhaps because at some very low level there truly is an element of randomness or chance that plays a role in the course of events.

We can name these three modes of decision making systematic, intuitive, and arbitrary. System is a prescribed set of procedures to be followed to arrive at a result. Intuition is a way of knowing that is not formulated as a system but is nonetheless effective. Perhaps it is a system that has not yet been formulated, or perhaps it is a wholly different way of knowing. Arbitrary decision making needs no special system because all results are equally acceptable. In reality we probably make decisions using some complicated combination of the three modes.

There is a lot that goes on in the working process of most composers that they would not be able to formalize as a rule-based or procedure-based system. Most composers use a combination of systematized or quasi-systematized knowledge, intuition, and probably at low, trivial levels, arbitrary decision making.

This set of essays is concerned primarily with systematic decision making, which is the type that lends itself most readily to computer programming. But let's take a look at how system relates to intuition and chance.

I have proposed here a use of the word "intuition" that might not be agreeable to everyone, but will be the working definition for the purposes of this essay: a means of decision making that is intentional and in which we have confidence, but for which we have not formalized a system. In short, things we know but don't know how we know them. It might be that we are in fact using a system of which we are not fully conscious, or it might be that it's a different sort of knowledge that is not encompassed by rationalist logic.

To teach a computer to use this sort of intuition would seem to be inherently impossible if, by definition, we don't know how to explain intuition. And indeed, for a computer to make autonomous decisions that aren't fully pre-determined by the system that is its software requires that at some level its algorithm must use a form of randomness, which is to say arbitrariness. A computer can readily enact a fully described system of procedures, and it can also enact arbitrary decisions using pseudo-random processes. But how can a computer enact or emulate intuition?

If we accept the premise that intuition is "things we know but don't know how we know them", then we could investigate intuition by trying to figure out how we know the things we know intuitively. To the extent that we can describe or emulate intuition by a formal system, we can gain insight into the nature of intuition. If we could imitate intuition with increasingly probing systems, until we arrived at a level where the only decision left is an arbitrary one, we could show that valid compositional methods might be totally systematized, with unpredictability and variety provided by pseudo-randomness.

Why does a composer or any artist choose one thing and not another? I propose that that "why" is ultimately reducible to a complex algorithm of "hows". That is to say, we may consider the explanation of why something is the way it is (Why do I like chocolate ice cream better than strawberry?) to be equal to the explanation of how that state was achieved. (By what mental process do I arrive at the discernment that chocolate is preferable?) The idea that decisions can be explained algorithmically is at the very heart of the field of algorithmic composition. Computers only know how to do things. They carry out instructions with no inkling or concern as to why they are doing them. Therefore, the business of programmers of artificial creativity is to turn whys into hows.

Let's take the example of a composer selecting a pitch to write on the page. Assuming that the composer has already decided to use only the 88 possibilities presented by the piano (or 89 if we include the "null" note, silence), some criteria for decision making are obviously necessary. A number of aesthetic criteria may be used by the composer in choosing a pitch: melodic contour, harmonic implications, etc. But the choice need not necessarily be based on aesthetic criteria. The composer may have a pre-established system (an algorithm, a list, etc.) or the choice may be made arbitrarily (by aleatoric means). In these instances the composer would simply be following established rules of decision making, which is something that computers do better and faster than humans. But the existence of those rules implies some prior aesthetic decision, either of commission or omission. An algorithm is being used because the composer decided at some earlier time that that algorithm would lead to a desired aesthetic result. How did the composer arrive at that decision? That previous aesthetic decision was presumably made using one of those same three means: systematic (using a system that is itself based on earlier aesthetic decisions), intuitive (using a system that has not yet been fully and consciously formalized), or arbitrary (using some unknown criteria or no criteria). So we see that rule-based decision making can always be traced back to some prior choice, either systematic or arbitrary.

When we try to trace aesthetic criteria themselves back to prior choices (By what criteria did we decide to use those criteria?) we may finally arrive at some seemingly banal conclusion such as "I don't know" ("I made the decision intuitively") or "It didn't matter" ("I made the decision arbitrarily"). The type of conclusion we reach in this genetic reconstruction of a compositional decision has implications for how to proceed to enact a similar decision by computer. A seemingly intuitive decision might be elucidated by further analysis of the underlying system. A seemingly arbitrary decision implies that randomness can be the source of desirable aesthetic results.

If we justify an intuitive aesthetic decision with "I just like it that way" we invoke an attribute called taste. Taste is a much-used term to describe a trait or criterion of aesthetic decision making, but no conclusive definition of taste has really been established. So far we don't know of a way for a computer to exercise genuine human taste or intuition, but randomness (or a very good facsimile thereof) and procedural systems are no problem at all for a computer.

Almost all computer programs that make autonomous decisions employ randomness on some level. Total randomness--also known as "white noise"--is rarely of aesthetic interest to most of us for very long. We tend to desire some manifestation of an ordering force that alters the predictably unpredictable nature of white noise. To produce anything other than white noise, a computer program must contain some non-arbitrary choices made by the programmer. Therefore, no decision-making program can be free of the knowledge and, yes, taste and intuition of the programmer.

For these reasons, we can analyze algorithmic composition--the process of programming a computer to make music using systematic and arbitrary procedures--as a complete and potentially fruitful method of composing music and/or other arts.
