Algorithmic Composition: mapping

Wednesday, April 1, 2009

Line-segment control function

In the article on fading by linear interpolation, you can see a demonstration of how a particular characteristic of a sound or an image (such as the amplitude of a sound or the brightness of an image) can be modified gradually over time. A word that's important for this type of operation is parameter, which means a numerical descriptor of a characteristic. For example, amplitude and brightness can each be controlled by a single number representing "gain" (the factor by which we turn it up or down). In that example using a "gain" factor to control amplitude or brightness simply involves multiplying the signal (the thing you want to modify) by the gain factor. In the case of audio, we multiply every single individual sample of the audio signal (tens of thousands of samples per second) by the gain factor; in the case of video we multiply every color value of every pixel of every frame by the gain factor. (That is exactly what the "brightness" operation is doing internally in the jit.brcosa object.) In each case, a single number sets a precise amount by which we modify a particular parameter. When discussing a sound or an image or a musical passage or a video, there are often many characteristics that can be usefully described by a number. When you get right down to it, it's usually possible to convert nearly any description into one or more numbers somehow, and once you've done that, the description can then be manipulated by arithmetic operations.
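As a minimal illustration of the idea that one number can control a parameter, here is a Python sketch (not Max code) of applying a gain factor to a short list of audio samples. The function name apply_gain and the sample values are made up for the example.

```python
def apply_gain(samples, gain):
    """Multiply every sample of the signal by a single gain factor."""
    return [s * gain for s in samples]

# A few hypothetical sample values in the range -1 to 1.
signal = [0.0, 0.5, -0.5, 1.0, -1.0]

# A gain of 0.5 turns the signal down to half its amplitude.
quieter = apply_gain(signal, 0.5)
print(quieter)
```

The same multiplication, applied to every color value of every pixel, would be one way to sketch a brightness control.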

So, as we saw in that example, when a parameter changes over time, it can create an interesting change in aesthetic effect (such as a fade in or out). The change in that case was linear and directional. That makes for a simple yet clear and direct type of change. You can think of the straight line as one simple kind of shape imposed on the characteristic being controlled. Other shapes such as smooth curves or irregular patterns are also possible.

Before we go on, let's define a couple of words: the nouns control and function.

A control is something that we don't perceive directly, but the effect of which we can perceive when it's applied to a parameter. For instance, in the fading by means of linear interpolation example, we don't literally see a line or hear a line, but we perceive the linear effect when the line is applied to the gain factor that controls brightness and amplitude.

A function is a defined relationship between two variables. Let's call those variables x and y, which could stand for anything. In general one variable, x, stands for something "given" or "known" (an example might be time, which we can know with some accuracy using a clock), and the other variable, y, stands for something the value of which will depend upon the value of x. We say that "y varies as a function of x", which means that there is a known relationship that permits us to know the value of y if we know the value of x. Often the relationship between x and y can be perfectly described by a mathematical equation that contains two variables, x and y. That's what mathematicians generally mean when they use the word function: an equation that permits you to calculate the value of y for every possible value of x you might put into the equation. That's what's being described by the examples in the article on the mathematical manipulation of digital media. The formulae such as y=x, or y = Asin(2πƒt+ø)+d, or y=mx+b are examples of functions in which the value of y depends on the value of x in a way that can be reliably calculated. If we plug many different values of x into the equation and calculate y for each one, and graph the results with x on the horizontal axis and y on the vertical axis, we'll get a shape. That shape is called the "graph of the function". But a mathematical equation is not the only way to define a relationship between two variables.

A function could also be a shape that is not easily described by a mathematical equation, and we would discover the value of y by mapping it to its corresponding x value on the graph. Another way would be to actually have a listed series of all possible x values and the y values that correspond with them. These methods may take a bit more memory to store a complex shape or a list of x,y pairs, but a) they allow us to use shapes that are not easily described mathematically and b) rather than requiring calculation, they just require a quick lookup of the y value based on the known x value.
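The lookup approach can be sketched in a few lines of Python. The table contents here are arbitrary values invented for the example, and the name lookup is just illustrative.

```python
# A hypothetical function stored as a list of known x values and their y values.
# There is no equation behind these numbers; the shape is arbitrary.
table = {0: 0.0, 1: 0.8, 2: 0.3, 3: 1.0, 4: 0.0}

def lookup(x):
    """Return the stored y for a known x; no calculation is required."""
    return table[x]

print(lookup(3))
```

A dictionary can hold only one value per key, so each x in the table has a single y, at the cost of storing every pair in memory.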

But regardless of the precise method of establishing the relationship between the known variable and the unknown variable--whether it's done by a calculation or a lookup--one important characteristic of a function is that it describes a knowable relationship (of any degree of complexity) in which each x value corresponds to exactly one y value.

So, by combining those two words, we arrive at an expression that is used frequently in sound synthesis, and which, as we will see, can also be used in algorithmic composition: control function. A control function is a shape that is used to control a parameter in sound or music or video or animation. Most commonly x is the passage of time, and y is the value of some parameter over that period of time. All kinds of shapes are potentially useful as control functions.

Straight lines

Curves

Trigonometric functions

Random or arbitrary x,y pairs

Freehand drawn shapes

Combinations of line segments

We'll look at how some of these control functions can be used to control or modify sounds, and then we'll transfer some of that thinking into the control of attributes of a musical structure or an animation. We'll start with line segment shapes such as the one depicted directly above.

The article on linear change introduces some of the math involved in making a formulaic description of linear change over time, and the article on linear interpolation introduces the handy Max object called line. The line object lets you just specify a destination value (the value you want to get to), a transition time (how long you want to take to get there), and a reporting interval (how often you want it to send out intermediate values along the way there), and it sends out a timed series of values that progress linearly from its current value to the destination value in the specified amount of time. Max also provides a line~ object for doing the same thing for an audio signal. The messages you send to a line~ object differ from those of line in two significant ways. First of all, there is no argument for the "reporting interval" in line~ because line~ sends out an audio signal, and every single sample of that signal reports an intermediate value interpolated between the starting value and the destination value; in effect, the reporting rate is the same as the audio sampling rate. So all line~ really requires is two numbers: the destination value and the transition time. The other difference is that line~ can receive multiple value-time pairs in the same message, all as part of the same list. For example, a message such as '1. 1500 0.5 500 0.5 2000 0. 6000' will cause line~ to send out a signal that goes to 1 in 1500 milliseconds, goes from 1 to 0.5 in 500 milliseconds, stays at 0.5 for 2000 milliseconds, goes to 0 in 6000 milliseconds, and stays at 0 until it receives a message causing it to go to a different value. In this way a single message can describe a function over time made up of several straight line segments.
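To make the meaning of such a message concrete, here is a rough Python sketch (not Max code, and not how line~ is implemented) that computes the value of the described function at any time. The name segment_value is made up, the starting value of 0 is an assumption, and t is assumed to be zero or greater.

```python
def segment_value(pairs, t, start=0.0):
    """Value at time t (in ms) of a function made of straight line segments.

    pairs is a flat list: destination1, time1, destination2, time2, ...
    start is the assumed value before the first segment begins.
    """
    current = start   # value at the beginning of the segment being examined
    elapsed = 0.0     # time at which that segment begins
    for i in range(0, len(pairs), 2):
        target, duration = pairs[i], pairs[i + 1]
        if duration > 0 and t < elapsed + duration:
            fraction = (t - elapsed) / duration
            return current + (target - current) * fraction
        current = target
        elapsed += duration
    return current    # past the last segment: hold the final value

# The same numbers as the message in the text.
message = [1.0, 1500, 0.5, 500, 0.5, 2000, 0.0, 6000]

for t in (0, 750, 1500, 1750, 3000, 7000, 12000):
    print(t, segment_value(message, t))
```

Halfway through the first segment (750 ms) the value is 0.5, during the flat third segment it stays at 0.5, and after the last segment ends it holds at 0.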

The function object allows the user to draw a line-segment shape of this sort. When the object receives a 'bang' it sends out such a message (intended for line~) that will cause line~ to send out that function shape. The minimum and maximum of the range (of the y axis) of the function can be set by a 'setrange' message, and the duration (of the x axis) can be set by a 'setdomain' message.

This patch shows the use of line segment control functions to shape the frequency and amplitude of a tone. When you choose a duration for the function, that number is used to set the domain of the function objects (and also uses whatever values have been chosen for the minimum and maximum of the frequency and amplitude ranges).

For the amplitude function, a shape has been chosen that is similar to the amplitude envelope of many instruments when the function takes place over a duration of about 500 to 2000 milliseconds. For the frequency function, a shape has been chosen that results in three up-down glissandi that increase in range and duration. Try listening to these control functions at different durations. When played very slowly over 10 seconds, the envelopes are clearly audible as gradual frequency glissandi and amplitude changes. When played over a quicker duration such as 1/2 second, the amplitude envelope sounds quite natural and the glissandi are quick and almost melodic. When played extremely fast, over 1/10 of a second, the glissandi are too fast to perceive as such, and the effect is mostly timbral. Try also changing the frequency range values to see what effect occurs when the range is very small or very large.

The point here is that the same functions can be "stretched" (augmented or diminished) over a variety of durations and/or ranges to create a wide variety of sonic/musical effects without changing the basic shape of the control functions. One can thus think of a distinctive control function shape as being analogous to a musical motive or formal structure. The same motivic shape can occur over a long period of time (phrase level), or a short period (note level), or an extremely short period (for timbral effect). In this program, one can also simply draw a new control function with the mouse, to create a new motive.
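The stretching idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the shape is stored as hypothetical breakpoints normalized to the range 0 to 1 on both axes, the name stretch is made up, and the times returned are absolute positions rather than the per-segment durations used in a line~ message.

```python
def stretch(shape, duration_ms, low, high):
    """Scale a normalized shape to a given duration and value range.

    shape is a list of (x, y) breakpoints with x and y both from 0 to 1.
    Returns a list of (value, time_ms) breakpoints.
    """
    return [(low + y * (high - low), x * duration_ms) for x, y in shape]

# A hypothetical envelope: quick attack, partial decay, slow release.
envelope = [(0.0, 0.0), (0.125, 1.0), (0.25, 0.5), (1.0, 0.0)]

note_level = stretch(envelope, 500, 0.0, 1.0)        # amplitude over half a second
phrase_level = stretch(envelope, 10000, 0.0, 1.0)    # same shape over 10 seconds
as_frequency = stretch(envelope, 500, 220.0, 880.0)  # same shape as a glissando in Hz

print(note_level)
print(phrase_level)
print(as_frequency)
```

The breakpoints keep the same proportions in every case; only the domain (duration) and the range (minimum and maximum) change.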

Tuesday, August 26, 2008

Linear change

"The shortest distance between two points is a straight line." That mathematical truism recalled from our geometry class is often quoted as a sort of proverb, reminding us that the most direct way to get somewhere is usually to head straight for it. The straight line stands for directness, and we can easily perceive its direction and predict where it will arrive if it continues. Thus, straight lines are useful for describing gradual-yet-direct change. If we draw a straight line from point A to point B, we're getting there by the most direct route, and we're also drawing every intermediate point that lies directly between A and B.

In the previous lesson, we provided the formula y=mx+b as a general way to describe any line. That's how it's given in arithmetic textbooks, but we'll need to rephrase it a bit for our purposes. For starters, how do we calculate the slope m? Well, for any two points--we'll call them A and B--we can calculate the slope of the line that runs through them (and thus the line segment that connects them) by dividing the vertical distance between them by the horizontal distance between them; that is to say by dividing the difference in their y values by the difference in their x values. So if the coordinates of point A are (xa, ya) and the coordinates of point B are (xb, yb), then the slope m is equal to (yb-ya)/(xb-xa). To refer to the last example from the previous lesson, the first point on the line segment is (0,36) and the last point is (100,96), so the slope of the line is (96-36)/(100-0), which is 60/100, which is 0.6. Thus, by knowing the slope m (0.6) and the offset b (36), we can calculate what y value will lie on the line for any x value we put into the formula.
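The slope calculation can be written out as a short Python sketch. The function name slope_and_offset is made up for the example; the offset b is found by rearranging y=mx+b at point A.

```python
def slope_and_offset(xa, ya, xb, yb):
    """Return (m, b) for the line through points (xa, ya) and (xb, yb)."""
    m = (yb - ya) / (xb - xa)  # difference in y divided by difference in x
    b = ya - m * xa            # rearranged from ya = m * xa + b
    return m, b

# The two points from the example in the text.
m, b = slope_and_offset(0, 36, 100, 96)
print(m, b)
```

For the points (0,36) and (100,96) this gives a slope of 0.6 and an offset of 36.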

In practical terms, for the purpose of programming linear change in sound, music, video, animation, etc., we'll need to know those values, or at least we'll need to be able to calculate them. (In the above example, we were able to calculate the slope because we knew the starting and ending x and y values.) Then, by starting at the desired value for x (a starting point in time) and proceeding to a desired destination value for x (a future point in time), we can calculate the values for y for as many intermediate x points as we want, to give the impression of a linear change in y. Before we look at an example, let's consider two other terms that are commonly used in digital media arts, which have direct relevance to this definition of a line.

-----

The term linear interpolation between two points A and B means finding an appropriate intermediate point (or points) that would exist if there were a straight line between A and B. To continue with the example we've been using, if we have the two points (0,36) and (100, 96), we can interpolate one or more points between them by calculating the y value at a hypothetical x value between 0 and 100. For example, just by using the formula and the known values for slope and offset, we can calculate that when x equals 20 y will equal 48, and when x equals 80 y will equal 84.
(0.6)20+36=48
(0.6)80+36=84
So, for any point between A and B, we can interpolate one or more additional points that will lie on a straight line segment between them. Another way to say this is that for any intermediate x value, we can find the appropriate corresponding y value.

Tangent: There's another way to think of linear interpolation, which is to think of "how far along" the intermediate x is, on a path from point A to point B. In other words, for the hypothetical x value, how far is it from its starting point xa, and how far is it from its destination point xb? We can actually calculate it as a fraction between 0 and 1 by calculating its distance from xa, which will be x-xa, and dividing that by the total distance between xa and xb, which will be xb-xa; so the fraction of the distance that a hypothetical x is on the path from xa to xb can be calculated with the expression (x-xa)/(xb-xa). So another way to think about calculating the y value for a hypothetical x value is to multiply the destination y (yb) by that fraction because we're that fraction of the way there, and multiply the starting y (ya) by 1 minus that fraction. The equation for finding the y value that corresponds with a hypothetical x value is thus y = yb((x-xa)/(xb-xa)) + ya(1-((x-xa)/(xb-xa))). Or, to put it a bit more simply, y = ((x-xa)(yb-ya))/(xb-xa)+ya. That's a valid formula for linear interpolation, or linear mapping of x to y.
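Here is the "how far along" idea as a small Python sketch. The function name lerp_map is made up for the example; it weights the destination y by the fraction and the starting y by 1 minus the fraction, as described above.

```python
def lerp_map(x, xa, xb, ya, yb):
    """Find y for an intermediate x by weighting ya and yb."""
    fraction = (x - xa) / (xb - xa)  # how far along x is, from 0 to 1
    return yb * fraction + ya * (1 - fraction)

# The two points used throughout the text: (0, 36) and (100, 96).
print(lerp_map(0, 0, 100, 36, 96))    # at the start, y is ya
print(lerp_map(100, 0, 100, 36, 96))  # at the destination, y is yb
print(lerp_map(20, 0, 100, 36, 96))   # one fifth of the way there
```

At x equals 20 the result is approximately 48, and at x equals 80 it is approximately 84, matching the values calculated earlier with the slope and offset (allowing for tiny floating-point rounding).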

This process of linear interpolation is thus very closely related to another term, linear mapping, which means making a direct correlation of an x value to its corresponding y value. If we have a given range of x values (say, from xa to xb), and a corresponding range of y values (say, from ya to yb), then for any value of x we can calculate the linearly corresponding y value. This is called mapping x values to y values. (In theory, there could be a wide variety of curved or non-linear "maps" by which we make these correlations between x values and y values; however, for now we'll stick to linear mapping.) The formula for linear mapping, if you needed to program it yourself, is shown in the preceding paragraph. Mercifully, Max provides many objects that can calculate mapping for you (such as zmap, scale, etc.), but it's important to understand mapping conceptually since it is key to all sorts of programming in digital media.

-----

Now let's look at a simple example of linear change. This program plays musical notes every 80 milliseconds (i.e., at the rate of 12.5 notes per second) for 2 seconds (2000 milliseconds). Over the course of those two seconds, the pitch of the notes changes linearly from MIDI 36 (low C) to MIDI 96 (high C) and the MIDI velocity (loudness) of the notes changes linearly from 124 (fortissimo) to 32 (piano). The result is a program that plays an ascending pentatonic scale, with a diminuendo.

Note that the program plays a major pentatonic scale, which is not a strictly linear configuration. (Some steps are 2 semitones, and some are 3 semitones.) Because we're stepping through five octaves of pitch in 25 increments, each step should be 1/5 of an octave. However, because MIDI does not allow for fractional values, the decimal part of the intermediary pitch values is truncated (chopped off), so ultimately not all the steps are exactly the same size. The fortuitous result of this truncation is the pattern that gives a major pentatonic scale. There's no magical mathematical relationship between truncation and this particular pitch pattern. It's just a fortunate (calculated) coincidence of the range, the number of notes being played within it, MIDI representation of pitch, and truncation of the fractional part of the number.

Let's look at a couple of details in the program. The clocker object reports at regular intervals the amount of time elapsed since it was turned on. This is similar to the timed counting demonstrated with the metro and counter objects, but in this case it is counting in increments of a certain number of milliseconds, corresponding exactly to the amount of time that has passed. This is handy because it allows us to check each report to see if a certain amount of time has passed--in this case 2000 milliseconds--and do something at that time (in this case, stop the process). Since we know the desired stopping time (that is to say, the destination x value) we can also use each reported time to calculate how far along we are to the destination time, and use that fraction for linear mapping of time to the pitch and velocity values.

We can divide the elapsed time (x) by the total time (2000) to get a fraction between 0 and 1. We then multiply that by the range of the desired y values (yb-ya), and add the desired offset (ya) to it. (We calculated the range of desired y values by subtracting the starting value from the ending value, i.e., yb-ya.) That's what's happening in the two expr objects. In the case of pitch, ya equals 36 and yb equals 96 so the range is 96-36, which is 60. In the case of velocity, ya equals 124 and yb equals 32 so the range is 32-124, which is -92. That's where the range values 60 and -92 come from in the expr objects. The numbers 36 and 124 in the expr objects are the ya (offset) values. The number boxes (because they are integer number boxes) drop the fractional part of the output from the exprs.
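The arithmetic described above can be imitated in Python to see the resulting numbers. This is a sketch of the calculation only, not a translation of the Max patch: the name linear_notes is made up, no MIDI is sent, and it assumes a value is also calculated at the final report of 2000 milliseconds.

```python
def linear_notes(interval_ms=80, total_ms=2000):
    """List (elapsed, pitch, velocity) for each reported time."""
    notes = []
    for elapsed in range(0, total_ms + 1, interval_ms):
        fraction = elapsed / total_ms          # how far along we are, 0 to 1
        pitch = int(fraction * 60 + 36)        # range 60, offset 36
        velocity = int(fraction * -92 + 124)   # range -92, offset 124
        notes.append((elapsed, pitch, velocity))
    return notes

notes = linear_notes()
pitches = [pitch for _, pitch, _ in notes]
velocities = [velocity for _, _, velocity in notes]

# Size of each step between successive pitches, in semitones.
steps = [b - a for a, b in zip(pitches, pitches[1:])]

print(pitches)
print(velocities)
print(steps)
```

Here int() drops the fractional part, as the integer number boxes do. The steps come out as a repeating pattern of 2, 2, 3, 2, 3 semitones, which is the major pentatonic scale.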

This direct use of the y=mx+b formula inside an expr object is just one way to do linear mapping in Max. The line object does automated linear interpolation, kind of like this combination of clocker and expr shown here, and other objects such as zmap and scale calculate mapping of x and y values if you provide the points A and B.

Linear motion, linear change, linear interpolation, and linear mapping are frequently used in composition and digital media.
