Leandro is a piece of software that explores different methods of audio synthesis and audio filtering, written entirely in C++ using PortAudio and Qt.
It synthesizes notes from MIDI files using its own instruments and audio filters. After importing a .MIDI file, the user can choose from the different synthesis methods and combine any number of filters, and the melody will be generated accordingly.
Everyone loves images, so here's a screenshot of Leandro's main screen in action:
Leandro also supports multi-channel MIDI files, and allows for each channel to have its own synthesis method and assortment of audio filters:
After experimenting with different "instruments" and filters, you might like what you hear: the session's recording can be started and stopped at any point, and .wav files can be saved with the contents of those recordings.
- Fast Fourier Transform
- Sample-Based Synthesis
- Additive Synthesis
- Sound synthesis using physical models
- Digital Effects
- Program implementation
- Contact
- License
To compute the Fast Fourier Transform, we used the decimation-in-time algorithm, better known as the Cooley-Tukey algorithm, which is derived below.
Starting from the assumption that what you want to obtain is the discrete Fourier transform,

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \qquad k = 0, \dots, N-1$$

Taking this into account, and knowing that the sum can be split into its even-indexed and odd-indexed terms,

$$X[k] = \sum_{m=0}^{N/2-1} x[2m]\, e^{-j 2\pi k (2m)/N} + \sum_{m=0}^{N/2-1} x[2m+1]\, e^{-j 2\pi k (2m+1)/N}$$

Therefore, if we call $E[k]$ and $O[k]$ the $N/2$-point DFTs of the even and odd samples respectively,

$$X[k] = E[k] + e^{-j 2\pi k / N}\, O[k]$$

Finally, given the periodicity of $E[k]$ and $O[k]$ with period $N/2$,

We obtain that,

$$X[k + N/2] = E[k] - e^{-j 2\pi k / N}\, O[k]$$
Finally, from the expression above, we can see that by saving the values obtained for half of the FFT interval, we can reuse them to calculate the missing half. The Cooley-Tukey algorithm exploits this property, recursively subdividing the even and odd parts until sequences of a single element remain.
The implementation of the algorithm can be seen below.
#include <complex>
#include <vector>

// Forward declaration: defined below.
std::vector<std::complex<double>> Cooley_Tukey(const std::vector<std::complex<double>>& samples);

// Thin wrapper converting the program's float buffers to doubles. n must be a power of two.
void fft(std::complex<float>* in, std::complex<float>* out, size_t n) {
    std::vector<std::complex<double>> temp(in, in + n);
    temp = Cooley_Tukey(temp);
    for (size_t i = 0; i < temp.size(); i++)
        out[i] = std::complex<float>(temp[i]);
}

// Recursive radix-2 decimation-in-time FFT.
std::vector<std::complex<double>> Cooley_Tukey(const std::vector<std::complex<double>>& samples) {
    size_t numSamples = samples.size();
    if (numSamples <= 1)
        return samples;

    // Split the input into its even- and odd-indexed halves.
    size_t newSampleSize = numSamples / 2;
    std::vector<std::complex<double>> Xeven(newSampleSize), Xodd(newSampleSize);
    for (size_t i = 0; i < newSampleSize; i++) {
        Xeven[i] = samples[2 * i];
        Xodd[i]  = samples[2 * i + 1];
    }

    // Recursively transform each half.
    std::vector<std::complex<double>> Feven = Cooley_Tukey(Xeven);
    std::vector<std::complex<double>> Fodd  = Cooley_Tukey(Xodd);

    // Combine: X[k] = E[k] + W_N^k O[k] and X[k + N/2] = E[k] - W_N^k O[k].
    const double PI = 3.14159265358979323846;
    std::vector<std::complex<double>> freqBins(numSamples);
    for (size_t i = 0; i < newSampleSize; i++) {
        std::complex<double> cmplxExponential =
            std::polar(1.0, -2.0 * PI * i / numSamples) * Fodd[i];
        freqBins[i] = Feven[i] + cmplxExponential;
        freqBins[i + newSampleSize] = Feven[i] - cmplxExponential;
    }
    return freqBins;
}
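As a quick sanity check, the wrapper can be exercised on a short power-of-two buffer. This is just a hypothetical snippet, not part of Leandro itself:

#include <cstdio>

int main() {
    // Eight-point impulse: its FFT should be all ones.
    std::complex<float> in[8] = { {1.0f, 0.0f} };  // remaining elements zero-initialized
    std::complex<float> out[8];
    fft(in, out, 8);
    for (int k = 0; k < 8; k++)
        std::printf("X[%d] = %.3f %+.3fj\n", k, out[k].real(), out[k].imag());
}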
The implementation was verified by transforming the data sequence expressed in the equation below and comparing against Python's FFT function; the results are shown in the figure below.
Sample-based synthesis is very different from other forms of synthesis such as additive synthesis. As seen above, additive synthesis uses basic functions such as sine, square, triangle or sawtooth waves to synthesize different sounds. Sample-based synthesis, in contrast, uses a sample, that is, a recorded sound or instrument, to recreate another sound. Consequently, this form of synthesis can produce a more realistic result, since it works on a real sound. It is also less demanding on the CPU than other forms of synthesis, such as synthesis using physical models, because the output signal is not calculated in real time: pre-recorded samples are used instead.

However, this type of synthesis involves a difficulty of its own. Suppose you want to synthesize a piano. With this method it is possible, but it would require pre-recording every note of the piano with every possible duration, which is clearly impossible since memory is finite. For this reason, algorithms are used to stretch and compress the sample in time and frequency, so that from one pre-recorded note another note with a different frequency and duration can be formed. That is the objective of this section: to determine an appropriate algorithm so that, from a recorded note, another note of a different frequency and duration can be created. In this way, with just a few samples, we can synthesize an instrument. Taking all this into consideration, we set out to synthesize a piano.
First, the possibility of modifying the pitch of a note is analyzed. Pitch is one of the four essential qualities of sound: it determines whether a sound is high or low. The higher the fundamental frequency, the higher the pitch; the lower the frequency, the lower the pitch. The pitch is determined by the fundamental frequency of the signal.
With these basic notions in place, the objective of this subsection can be defined. Starting from a sound with a certain pitch, such as a piano note, we want to modify its fundamental frequency in order to generate another note with the same duration as the original. This is possible by means of a frequency shift, or pitch shifting. The process involves two steps: first, the pitch must be identified (pitch detection), and second, the pitch shift or correction must be made (pitch correction).
The pitch of a signal can be analyzed in the frequency domain as well as in the time domain. The figure above shows a simple sample of the note C3 of a piano. Note that the frequency-domain plot tells much more than the time-domain plot: the peaks of the signal are clearly visible. The most prominent peak determines the fundamental frequency and, as stated above, the fundamental frequency is the "pitch" of the signal. For this particular signal, the pitch is approximately 130.8 Hz.
Typical detection methods include autocorrelation or average magnitude difference methods. These methods are very useful when the signal to be pitch-shifted is unknown. However, since the objective of this section is to synthesize the notes of a piano, the different pitches are already known beforehand. Recalling music theory, the frequency of the $k$-th key of a piano is given by the equation:

$$f(k) = 2^{\frac{k-49}{12}} \cdot 440\ \text{Hz}$$
For example, note C3, which is key number 28, has a fundamental frequency or pitch of $f(28) = 2^{\frac{28-49}{12}} \cdot 440 \approx 130.81$ Hz.
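Since the program knows every target pitch in advance, this formula maps directly to code. A minimal sketch (the function name is illustrative, not Leandro's actual API):

#include <cmath>

// Fundamental frequency of the k-th piano key (A4 = key 49 = 440 Hz).
double pianoKeyFrequency(int k) {
    return 440.0 * std::pow(2.0, (k - 49) / 12.0);
}
// pianoKeyFrequency(28) ~= 130.81 Hz (C3), pianoKeyFrequency(40) ~= 261.63 Hz (C4).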
The following is an analysis of a sample to demonstrate the theory described above. The analysis is performed with the free program Audacity, which can plot the frequency spectrum of a sample. The figure below shows the spectrum of a sample of the note C4 of a classical electronic piano. The sample was acquired using the GarageBand program.
As can be seen in the figure above, the fundamental frequency is approximately 261.63 Hz, as expected for C4.
As a final clarification on the subject: do not confuse the pitch, which has units of Hz and is observed in the frequency domain, with the pitch period, which is the reciprocal of the pitch, has units of seconds, and is observed in the time domain.
In the previous subsection we analyzed how to determine the pitch of a sample. Now we analyze how to modify or, better said, correct the pitch of a sample to generate a new pitch. Correction algorithms can work in the time domain or in the frequency domain. Generally, those working in the frequency domain are very robust but suffer from long latencies. Therefore, a time-domain method is used.
Most techniques that work in the time domain to perform pitch correction are based on TD-PSOLA (Time-Domain Pitch Synchronous Overlap and Add). This is the method used in this section because it is fast, simple and, as will be shown later, gives very good results.
The TD-PSOLA method can be divided into three steps:
* Decomposition of the input signal into a series of short-time analysis signals.
* Modification of the short-time analysis signals into short-time synthesis signals.
* Composition of the new signal from the short-time synthesis signals; this is where the overlap-add concept comes in.
Step 3 is where the interesting thing happens. When recomposing the signal, the short-time synthesis signals can be pushed together or pulled apart, which changes the fundamental frequency of the signal. If the signals move closer together, the fundamental frequency increases (higher pitch); conversely, if the signals move apart, the fundamental frequency decreases (lower pitch). An illustration of the method is shown in the figure above. Keeping in mind the steps of the method, we now develop each one in detail.
This step consists of decomposing the input signal $x(n)$ into short-time analysis signals $x_m(n)$ by windowing it around the analysis pitch-marks $t_m$:

$$x_m(n) = h(n - t_m)\, x(n)$$

where $h(n)$ is an analysis window (typically a Hanning window) and the pitch-marks $t_m$ are placed on the peaks of the signal, separated by approximately one pitch period. Returning to the analysis window, its length is taken as roughly twice the pitch period, so that consecutive short-time signals overlap.
The second step consists of transforming the sequence of short-time analysis signals into a sequence of short-time synthesis signals. The latter are synchronized with a new set of time instants, the synthesis pitch-marks, whose spacing is the desired synthesis pitch period.
Composition of the synthesized signal. We already have the synthesis pitch-marks, so each short-time synthesis signal must be centered on these time instants. But a small problem arises: if the duration of the original signal is to be maintained, a synthesis signal may have to be repeated at two or more synthesis pitch-marks. To determine which signal is repeated, we simply find which analysis pitch-mark is closest to the synthesis pitch-mark being evaluated. In this way the synthesized signal is 'filled in'. Depending on the value of the pitch-scaling factor, segments end up being repeated (when raising the pitch) or skipped (when lowering it).
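As a rough illustration of the three steps, here is a minimal TD-PSOLA sketch, assuming a known constant pitch period in samples and a Hanning window. The names (pitchShiftPSOLA, period, beta) are illustrative, not Leandro's actual code:

#include <cmath>
#include <vector>

// Minimal TD-PSOLA pitch shifter for a signal with a known, constant
// pitch period (in samples). beta > 1 raises the pitch, beta < 1 lowers it.
// The output keeps the duration of the input.
std::vector<float> pitchShiftPSOLA(const std::vector<float>& x, int period, double beta) {
    const int winLen = 2 * period;              // analysis window: two pitch periods
    const double PI = 3.14159265358979323846;
    std::vector<float> window(winLen);
    for (int n = 0; n < winLen; n++)            // Hanning window
        window[n] = 0.5f - 0.5f * (float)std::cos(2.0 * PI * n / (winLen - 1));

    std::vector<float> y(x.size(), 0.0f);
    double synthHop = period / beta;            // spacing of the synthesis pitch-marks

    // Walk the synthesis pitch-marks; for each, overlap-add the windowed
    // segment taken around the closest analysis pitch-mark.
    for (double ts = 0.0; ts + winLen < (double)y.size(); ts += synthHop) {
        int nearestAnalysisMark = (int)(std::round(ts / period) * period);
        for (int n = 0; n < winLen; n++) {
            int src = nearestAnalysisMark + n;
            int dst = (int)ts + n;
            if (src < (int)x.size() && dst < (int)y.size())
                y[dst] += window[n] * x[src];
        }
    }
    return y;
}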
The figure above shows the method applied to a sample of C4 from a classical electronic piano in order to synthesize C#4 (fundamental frequency one semitone higher than C4). In the first graph, the peaks or pitch-marks of the signal, separated by approximately one pitch period, are detected. In the second graph, the windows centered on the pitch-marks are placed. The third plot shows the short-time analysis signals already modified by the window. The fourth graph shows the signal synthesized with the synthesis pitch-marks. It remains to be verified that the fundamental frequency of the synthesized signal is indeed that of C#4 (approximately 277.18 Hz).
In conclusion, pitch shifting was successfully achieved on a sample: the fundamental frequency was shifted while maintaining the sample's duration. As an additional consideration, it is possible to work with different windows, such as the Bartlett window, and with different widths, and to compare the results.
Second, temporal scaling is analyzed. The method is very similar to the one used for pitch shifting; the difference is that now we are not interested in changing the pitch but in modifying the duration of the signal. To achieve this, the TD-PSOLA method is used again. Step 1 remains exactly the same: the signal is decomposed into a series of short-time analysis signals. Step 2 undergoes a small correction: since the pitch must now stay the same, the synthesis signals must keep the same spacing as the analysis signals, that is, one pitch period. Step 3 then repeats or skips segments to stretch or compress the signal to the desired duration.
Figure below shows an illustration of the time modification method.
This section explains the implementation of the previously described methods to achieve the sample-based synthesis of a piano.
First, the samples were gathered. To synthesize a piano we used samples of the notes C1, C2, C3, C4, C5 and C6; having one sample per octave allows us to synthesize every note. That is to say, if the note D3 is wanted, the sample C3 is used to make the corresponding frequency shift; if A5 is desired, C5 is used, and so on. As can be seen, each octave sample can synthesize the other 11 notes of its octave (octaves are separated by 12 semitones). Once we had the samples, we performed a spectral analysis of each one to determine its fundamental frequency. Having everything necessary, we then applied the methods previously explained.
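The mapping from a desired note to a sample and a shift factor is straightforward. A hypothetical sketch (names are illustrative):

#include <cmath>

// Given a piano key number (A0 = 1), pick the C sample of its octave and
// the frequency ratio to shift it by. Key 4 is C1, and keys repeat every 12.
struct SampleChoice { int sampleOctave; double shiftRatio; };

SampleChoice chooseSample(int key) {
    SampleChoice c;
    c.sampleOctave = (key - 4) / 12 + 1;        // which C sample to start from
    int semitonesAboveC = (key - 4) % 12;       // distance from that C
    c.shiftRatio = std::pow(2.0, semitonesAboveC / 12.0);
    return c;
}
// chooseSample(30) -> sample C3, shiftRatio = 2^(2/12) (D3 is two semitones above C3).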
The figure below shows a modification of a C4 note. The first graph shows the original signal, the third shows C4 with a two-semitone shift, and the last shows a temporal lengthening of the sample. The second graph shows something interesting: notice that fragments of the signal are repeated to fill the time. This was implemented this way because it is how sample synthesizers work. The user is always given the option to select the fragment of the signal to be repeated, in case a signal with a longer duration than the original is desired.
The basics of sample-based synthesis, its operation and its implementation were understood, and it was possible to shift both the duration and the fundamental frequency of a sample in order to synthesize any note of a piano.
With the knowledge acquired, it was possible to synthesize several instruments, such as a Steinway piano, an electric piano and a guitar, among others.
Additive synthesis is based on the concept of the Fourier series, specifically on the fact that a periodic signal can be written as a sum of sinusoidal signals of different frequencies and amplitudes:

$$x(t) = \sum_{k=1}^{\infty} A_k \sin(2\pi k f_0 t + \phi_k)$$
The expression above generates purely periodic functions, but such signals do not faithfully represent the signal generated by an instrument. Each instrument has a certain timbre, that is, its sound has a characteristic distribution of harmonic power. In addition to a characteristic timbre, the sound of an instrument is characterized by the amplitude envelope that modulates it.
The signal of an instrument can therefore be modeled as a sum of signals of different frequencies, which we will call partials, since they do not necessarily have to be integer multiples of the fundamental frequency as the expression above indicates. Furthermore, the coefficients $A_k$ are not constant: each one varies in time, following the envelope of its partial:

$$x(t) = \sum_{k=1}^{K} A_k(t) \sin(2\pi f_k t + \phi_k)$$
Considering the above, two additive synthesis models were proposed in this report. The first, which we will denote ADSR, proposes that the envelope modulating the amplitude is the same for all the partials that make up the signal. The second assumes, as in the expression above, that each partial is modulated by a different envelope $A_k(t)$.
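A minimal sketch of the second model, assuming the per-partial envelopes are available as sampled curves (all names here are illustrative, not Leandro's actual API):

#include <cmath>
#include <vector>

// Additive synthesis with one amplitude envelope per partial.
// envelopes[k][n] is the amplitude of partial k at sample n.
std::vector<float> synthesizeNote(double f0, double fs, int numSamples,
                                  const std::vector<std::vector<float>>& envelopes) {
    const double PI = 3.14159265358979323846;
    std::vector<float> out(numSamples, 0.0f);
    for (size_t k = 0; k < envelopes.size(); k++) {
        double fk = (k + 1) * f0;               // k-th partial (harmonic here)
        for (int n = 0; n < numSamples; n++) {
            float a = n < (int)envelopes[k].size() ? envelopes[k][n] : 0.0f;
            out[n] += a * (float)std::sin(2.0 * PI * fk * n / fs);
        }
    }
    return out;
}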
It was proposed to implement the additive synthesis of four instruments: piano, clarinet, trombone and trumpet. For each of these instruments, audio samples were taken in .wav format, and a very selective band-pass filter was applied to them so that the envelope signal of each partial could be extracted.
This methodology was repeated for one sample of each octave that the instrument is capable of reproducing, given that when synthesizing a note of, for example, the 6th octave, the synthesis will be more faithful if its parameters were obtained from a sample of that octave rather than, say, the 1st octave.
To explain the methodology implemented, the method is developed for the note A of the 5th octave of an acoustic piano. First, the envelope curves were extracted from a .wav sample for the first 12 harmonics, that is, for $f_k = k f_0$ with $k = 1, \dots, 12$.
The figure above shows the envelopes of the 12 selected partials. Then, when synthesizing a note, the expression above was applied using the envelopes and frequencies obtained for the corresponding octave.
Below is the Python code used to obtain the mentioned data:
import numpy as np
from numpy import sign
from scipy import signal, interpolate
from scipy.io import wavfile
import matplotlib.pyplot as plt
from distutils.spawn import find_executable
import xlsxwriter

file = 'samples/A5-880.0.wav'
octave = 5

# Use LaTeX fonts for the plots if a LaTeX installation is available.
if find_executable('latex'):
    plt.rc('font', **{'family': 'serif', 'serif': ['Palatino'], 'size': 17})
    plt.rc('text', usetex=True)

freqs = []
octave_envelopes = []

# Load the sample, mix it down to mono and normalize it.
rate, data = wavfile.read(file)
d = data.astype(float)
d = (d[:, 0] + d[:, 1]) / 2
maxD = max(d)
data = [i / maxD for i in d[0:441000]]
t = np.arange(len(data)) / rate

# The fundamental frequency is encoded in the file name ('A5-880.0.wav').
f0 = float(file.rsplit('.', 1)[0].split('-')[-1])
plt.plot(data)
plt.show()

# Extract the envelope of each of the first 12 harmonics.
for i in range(1, 13):
    fi = i * f0
    freqs.append(fi)

    # Design a very selective elliptic band-pass filter around the harmonic.
    ws = [0.9 * fi, 1.1 * fi]      # stop band
    wp = [0.95 * fi, 1.05 * fi]    # pass band
    N, wn = signal.ellipord(wp, ws, 0.5, 50, fs=rate)
    sos = signal.ellip(N, 1, 100, wn, 'bandpass', output='sos', fs=rate)
    filtered = np.array(signal.sosfilt(sos, data))

    # Find the local maxima of the filtered harmonic...
    u_x = [0]
    u_y = [filtered[0]]
    for k in range(1, len(filtered) - 1):
        if sign(filtered[k] - filtered[k - 1]) == 1 and sign(filtered[k] - filtered[k + 1]) == 1:
            u_x.append(k)
            u_y.append(filtered[k])

    # ...and interpolate between them to obtain the amplitude envelope.
    u_p = interpolate.interp1d(u_x, u_y, kind='cubic', bounds_error=False, fill_value=0.0)
    envelope = [float(u_p(k)) for k in range(len(filtered))]
    del envelope[1::2]  # decimate the envelope by 2 to halve its size
    octave_envelopes.append([envelope])

    t = np.arange(len(envelope)) / rate
    plt.plot(t[:rate], envelope[:rate], label='Harmonic: ' + str(i))

plt.xlabel('Time [s]')
plt.ylabel('Amplitude')
plt.grid('minor')
plt.legend()
plt.show()

# Export the 12 envelopes, one column per harmonic, for the synthesizer to load.
with xlsxwriter.Workbook('OCT' + str(octave) + '.xlsx') as workbook:
    worksheet = workbook.add_worksheet()
    for col, env in enumerate(octave_envelopes):
        worksheet.write_column(0, col, env[0])
The parameters presented to the GUI user for this synthesis model were the coefficients of each of the 12 harmonics that make up the signal; that is, the user is given the possibility of modifying the amplitudes of its spectral components to give the instrument different timbres.
Below is the result obtained using the aforementioned method for the synthesis of an acoustic piano note.
It was observed that the syntheses of the wind instruments turned out more faithful than the synthesis of the piano.
There are many musical synthesis techniques, including frequency-modulation (FM) synthesis, waveshaping, and additive and subtractive synthesis, but to achieve high-fidelity sounds they require arithmetic speed found only in powerful computers or digital synthesizers, inaccessible to most. The type of synthesis presented here can be applied simply and on most computers on the market; it is an algorithm easily implemented not only in software but also in hardware. Compared to additive synthesis, the plucked-string sound it produces is much more natural and fluid, since in experiments only a sine wave generator was needed to produce a similar tone (Sleator 1981).
This method is one of the bases for explaining synthesis through physical models. It consists of repeating a number of samples continuously to generate a purely periodic signal.
The resulting tone is determined by the initial contents of the table of the recurrence relation. A waveform (sine, triangular, square, etc.) is loaded into a table of length $p$ and read cyclically at a specific sampling frequency $f_s$, producing a tone of fundamental frequency $f_s / p$.
This technique is very simple but musically dull, because the tones are purely periodic while real instruments produce sounds that vary over time. Without modifications to the table, the harmonic content of the tone is fixed; however, in order not to lose the frequency of the specified tone, the period-to-period changes of the samples must be small. To keep the processing complexity low, we can modify only the sample we are currently reading.
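A wavetable oscillator of this kind reduces to a cyclic read of the table. A minimal sketch (names are illustrative):

#include <utility>
#include <vector>

// Cyclic wavetable read: a table of p samples played at rate fs
// produces a tone of frequency fs / p.
class WavetableOscillator {
public:
    explicit WavetableOscillator(std::vector<float> table)
        : table_(std::move(table)), pos_(0) {}

    float nextSample() {
        float s = table_[pos_];
        pos_ = (pos_ + 1) % table_.size();
        return s;
    }
private:
    std::vector<float> table_;
    size_t pos_;
};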
The modification proposed by Alex Strong makes the modifier an averager of two successive samples:

$$y(n) = \frac{y(n-p) + y(n-p-1)}{2}$$
The effect of this modification is a smooth decay of the waveform, at a rate that differs between the harmonics. The result is a very natural sound which, regardless of the original spectrum, decays to an almost pure sinusoid and ends in silence. From a hardware point of view, the operations needed are of low complexity: just an addition and a binary shift.
Given the recurrence relation, we must preload the table with initial values. It is advisable to use random values at the beginning of each note, because they have most of their spectral content in the high acoustic frequencies. As long as the samples are repeated periodically, the randomness will not produce hisses or clicks. The subtle spectral variation between notes is enough to make them seem to come from the same instrument, without the sound feeling mechanical.
Without the decay algorithm, a table of random values has equal harmonic content up to the Nyquist frequency and sounds like an electric organ (reed organ).
To generate the initial conditions we use two-level random samples, that is, each entry of the table is $+A$ or $-A$ with equal probability.
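Putting the pieces together, the plucked-string algorithm fits in a few lines. A minimal sketch, assuming the table length p is chosen as fs / f (names are illustrative):

#include <cstdlib>
#include <vector>

// Karplus-Strong plucked string: a table of two-level random samples,
// updated on each read with the average of two successive samples.
class PluckedString {
public:
    PluckedString(double fs, double f, float amplitude) {
        size_t p = (size_t)(fs / f);            // table length sets the pitch
        table_.resize(p);
        for (size_t i = 0; i < p; i++)          // two-level random initial load
            table_[i] = (std::rand() % 2 ? amplitude : -amplitude);
        pos_ = 0;
    }

    float nextSample() {
        float out = table_[pos_];
        size_t next = (pos_ + 1) % table_.size();
        // Decay modifier: average of two successive samples.
        table_[pos_] = 0.5f * (out + table_[next]);
        pos_ = next;
        return out;
    }
private:
    std::vector<float> table_;
    size_t pos_;
};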
The root mean square value of the output amplitude at the start of the note is then exactly $A$, regardless of the particular random sequence loaded.
We have to separate the process into two parts. In the first, we have the set of $p$ random samples preloaded into the table, which are read out only once (the non-recursive part of the system).
The second part is the recursive process, where only the output and its delayed values are involved.
The total system, with the random initial burst as input, is represented by the transfer function

$$H(z) = \frac{1}{1 - \frac{1 + z^{-1}}{2}\, z^{-p}}$$
We analytically obtain the frequency response of the model taking into account the periodicity as a function of the frequency of the synthesized note.
Next, the digital effects implemented in the program will be explained, developed and compared against the original sound shown in the figure above. In addition, the original sound files and the files with the applied effects accompany this report so the effects can be better appreciated.
Delay lines are one of the most important building blocks of several of the most prominent effects. In general they are easy to implement, and small changes in their parameters produce the different effects described below.
The basic delay plays back an audio signal after a specified time. This time can vary from a few milliseconds to several seconds. In the figure below you can see a basic diagram of a delay block.
This delay block is expressed by the equation

$$y(n) = x(n - M)$$

where $M$ is the delay in samples.
It is important to emphasize that the delay block is linear and time-invariant (LTI). It can easily be shown that its Z transform is

$$H(z) = z^{-M}$$
In the figure below you can see how the waveform and the spectrogram of the signal are affected when the effect is applied.
While the basic delay is very easy to implement, it has some problems, the main one being that it produces only a single echo at the output. Feedback, on the other hand, makes the output carry a continuous echo, which is what is usually sought. A block diagram of the delay with feedback can be seen in the figure below.
One easily arrives at the equation of the delay with feedback:

$$y(n) = x(n) + g\, y(n - M)$$

where $g$ is the feedback gain.
Finally, applying the Z transform to the previous expression, we trivially arrive at:

$$H(z) = \frac{1}{1 - g z^{-M}}$$
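A feedback delay reduces to a circular buffer. A minimal sketch (illustrative names, with |g| < 1 assumed for stability):

#include <vector>

// Feedback delay: y(n) = x(n) + g * y(n - M), implemented with a
// circular buffer holding the last M output samples.
class FeedbackDelay {
public:
    FeedbackDelay(size_t M, float g) : buf_(M, 0.0f), g_(g), pos_(0) {}

    float process(float x) {
        float y = x + g_ * buf_[pos_];   // buf_[pos_] holds y(n - M)
        buf_[pos_] = y;
        pos_ = (pos_ + 1) % buf_.size();
        return y;
    }
private:
    std::vector<float> buf_;
    float g_;
    size_t pos_;
};
// With the feedback term removed (g = 0), the same buffer yields the basic delay y(n) = x(n - M).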
Vibrato is defined as a quasi-periodic variation in the pitch of a note. Vibrato is characterized by its frequency (how often the pitch of a note changes) and its width (the total variation in the pitch of the note).
The implementation of vibrato is based on a delay block modulated by a low frequency oscillator (LFO). A basic diagram of vibrato can be seen in the figure below.
The expression of the vibrato is given by

$$y(n) = x(n - M(n))$$

with

$$M(n) = M_0 + W \sin\!\left(2\pi \frac{f_{LFO}}{f_s} n\right)$$

where $M_0$ is the nominal delay, $W$ the width of the vibrato and $f_{LFO}$ the frequency of the low frequency oscillator. It is important to clarify that for this effect to work, $M(n)$ takes fractional values of delay, so the input signal must be interpolated between samples. Linear interpolation was used:

$$x(t) \approx x(n) + (t - n)\,\big(x(n+1) - x(n)\big) \quad \text{for } n < t < n+1$$
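A sketch of a vibrato line with linear interpolation follows (a hypothetical, simplified version; M0, W and fLfo as defined above):

#include <cmath>
#include <vector>

// Vibrato: read a delay line at a fractional, LFO-modulated position,
// using linear interpolation between the two neighboring samples.
class Vibrato {
public:
    Vibrato(double fs, double fLfo, double M0, double W)
        : buf_((size_t)(M0 + W) + 2, 0.0f), fs_(fs), fLfo_(fLfo),
          M0_(M0), W_(W), n_(0), pos_(0) {}

    float process(float x) {
        buf_[pos_] = x;
        const double PI = 3.14159265358979323846;
        double M = M0_ + W_ * std::sin(2.0 * PI * fLfo_ * n_ / fs_);
        double readPos = (double)pos_ - M;
        while (readPos < 0) readPos += buf_.size();
        size_t i0 = (size_t)readPos;
        size_t i1 = (i0 + 1) % buf_.size();
        double frac = readPos - i0;              // linear interpolation weight
        float y = (float)((1.0 - frac) * buf_[i0] + frac * buf_[i1]);
        pos_ = (pos_ + 1) % buf_.size();
        n_++;
        return y;
    }
private:
    std::vector<float> buf_;
    double fs_, fLfo_, M0_, W_;
    long n_;
    size_t pos_;
};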
In the figure below you can see how the waveform and the spectrogram of the signal are affected when the effect is applied. You can clearly see how the frequencies do not remain constant but instead follow the shape of a very low-frequency sine wave.
The flanger effect is based on the principle of constructive and destructive interference. A basic flanger is closely related to the basic delay, as seen in the equation below:

$$y(n) = x(n) + g\, x(n - M(n))$$
The delay $M(n)$ varies periodically over time, driven by a low-frequency oscillator, and is kept small (on the order of a few milliseconds).
In this way, for a given delay $M$, its frequency response will be defined by

$$|H(e^{j\omega})| = |1 + g\, e^{-j\omega M}|$$

where $\omega$ is the digital frequency; the comb-like notches produced by the destructive interference sweep along the spectrum as $M(n)$ varies.
In the same way as the feedback delay, the flanger has its own feedback form, closely related to its delay counterpart. The flanger with feedback is shown in the figure below.
In the same way as with the delay, its expression is given by

$$y(n) = x(n) + g\, y(n - M(n))$$
Its Z transform (for a fixed delay $M$) being

$$H(z) = \frac{1}{1 - g z^{-M}}$$
Finally, its frequency response will be given by

$$|H(e^{j\omega})| = \frac{1}{|1 - g\, e^{-j\omega M}|}$$

where the peaks now appear at the frequencies reinforced by the feedback.
It is important to clarify that, since the structure is recursive, the feedback gain must satisfy $|g| < 1$ for the effect to remain stable.
As with the vibrato, the flanger (basic or with feedback) also interpolates the input signal; the same linear interpolation explained for the vibrato is used.
In the figure below you can see how the waveform and the spectrogram of the signal are affected when the effect is applied.
While the previous section analyzed delay-based effects, this section focuses on filter-based effects, whether low-pass, high-pass, band-pass, peaking/notch, shelving or all-pass.
Although there are many filter-based effects, the main and most widely used is the equalizer. The equalizer is the digital effect responsible for "equalizing" (i.e., balancing) the different frequency bands to compensate for non-ideal behavior of the equipment and the room acoustics. All equalizers used today are filter-based, and most are composed of multiple sub-filters. The number of bands to be equalized is arbitrary and chosen by whoever designs the equalizer, so, to keep the development of this report bounded, we will focus on a 3-band equalizer. The bands will be the low, mid and high tones, and the center frequency of each band is spaced logarithmically, similarly to what was done in the Circuit Theory report. In the figure below you can see a block diagram of a 3-band equalizer with a cascade connection.
For the 3-band digital equalizer, two shelving filters and one peaking/notch filter are used, with the transfer functions given below. (Although the mathematical derivation of these transfer functions will not be carried out here for practical reasons, it is fully detailed in chapter 3 of [eff1].)
As can be seen in the equations below, the formulas are bulky and not at all simple. This is why a few parameters will be fixed, in order to obtain transfer functions that are easier to implement and to understand.
The first parameters to fix are the center frequencies of each filter, since our equalizer has fixed equalization bands. By choosing them arbitrarily in advance, the transfer functions simplify considerably.
The next parameters to set will be the respective bandwidths of each filter. For this, the following bandwidths were chosen:
Finally, it is known that typical tone controls have a bounded gain variation, and a comparable range was adopted for each band of the equalizer.
Although there are two ways to interconnect the filters, in this case they are connected in cascade, since 3-band filters of this type are usually cascaded and this generally gives better results.
As seen throughout this report, the efficiency of the algorithms is crucial in this type of system, and these effects are no exception. For this reason it was decided not to recalculate the filter coefficients on every pass, but only when the parameter values change.
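The coefficient-caching idea looks roughly like this; a generic biquad is used here as a stand-in for the actual shelving/peaking transfers (all names illustrative):

// A direct-form-I biquad whose coefficients are recomputed only when a
// parameter changes, never inside the per-sample loop.
class CachedBiquad {
public:
    void setGainDb(float gainDb) {
        if (gainDb != gainDb_) {     // recompute only on change
            gainDb_ = gainDb;
            computeCoefficients();
        }
    }

    float process(float x) {
        // y(n) = b0 x(n) + b1 x(n-1) + b2 x(n-2) - a1 y(n-1) - a2 y(n-2)
        float y = b0_ * x + b1_ * x1_ + b2_ * x2_ - a1_ * y1_ - a2_ * y2_;
        x2_ = x1_; x1_ = x;
        y2_ = y1_; y1_ = y;
        return y;
    }
private:
    void computeCoefficients() {
        // Here the (bulky) shelving/peaking formulas would fill b0..a2
        // from gainDb_, the fixed center frequency and the bandwidth.
    }
    float gainDb_ = 0.0f;
    float b0_ = 1.0f, b1_ = 0.0f, b2_ = 0.0f, a1_ = 0.0f, a2_ = 0.0f;
    float x1_ = 0.0f, x2_ = 0.0f, y1_ = 0.0f, y2_ = 0.0f;
};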
To implement these filters it is not enough to have the transfer function in $z$; the inverse Z transform is applied to obtain the recurrence (difference) equations of the filters. These equations, detailed below, are the ones implemented in the program's effects system.
For the case of the shelving filters, we obtain,

On the other hand, for the peaking/notch filter, we have,
The results of applying the equalizer in different ways can be seen in the figures below.
To implement an 8-band equalizer, all the previously explained theory was reused, except that this time all the filters were of the peaking/notch type, each corresponding to the third transfer function given above.
To determine the center frequencies of each filter, a logarithmic spacing between them was used, following the expression below.
To determine the bandwidths of each filter, the expression below was adopted arbitrarily.
Finally, to determine the range of gains, a symmetric interval around unity gain was taken for each band.
Luckily for the reader, the wah-wah effect uses only one digital filter, so the mathematical work required is smaller. Without theoretical justification in this report, the transfer function of a wah-wah is given below.
Where,
Therefore, its expression in differences will be determined by,
As seen in the transfer function above, the wah-wah uses a low-pass structure; however, the special thing about the wah-wah effect is that, unlike an ordinary resonant low-pass filter, its cutoff frequency $f_c$ varies over time.
The values that $f_c$ takes are confined to a fixed interval, which is swept periodically.
In the figure below you can see how the waveform and the spectrogram of the signal are affected when the effect is applied.
For the reverb effect, the plain reverb type was implemented, with transfer function

$$H(z) = \frac{z^{-D}}{1 - g z^{-D}}$$

where $D$ is the loop delay and $g$ the feedback gain.
It can be seen that its difference equation is given by

$$y(n) = x(n - D) + g\, y(n - D)$$
In the figure below you can see how the waveform and spectrogram of the signal are affected when the effect is applied.
In a similar way to how the plain Schroeder reverberator was implemented, a variant with a simple one-pole low-pass filter inserted in the feedback loop was built, so that high frequencies decay faster than low ones.
Therefore, the difference equation of the Schroeder reverberator with the low-pass filter is obtained by low-pass filtering the fed-back term $y(n - D)$ before adding it to the input.
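A sketch of both variants follows (a plain feedback comb, plus an optional one-pole low-pass in the loop; the damping coefficient a is an assumption of this sketch, not a value from the program):

#include <vector>

// Feedback comb reverberator: y(n) = x(n - D) + g * y(n - D), realized
// through an internal state w with w(n) = x(n) + g * lp(w(n - D)) and
// y(n) = w(n - D). With a = 0 the low-pass is transparent (plain reverb);
// a > 0 damps high frequencies faster, as in the low-pass variant.
class CombReverb {
public:
    CombReverb(size_t D, float g, float a = 0.0f)
        : buf_(D, 0.0f), g_(g), a_(a), lp_(0.0f), pos_(0) {}

    float process(float x) {
        float delayed = buf_[pos_];              // w(n - D)
        lp_ = (1.0f - a_) * delayed + a_ * lp_;  // one-pole low-pass in the loop
        buf_[pos_] = x + g_ * lp_;               // w(n)
        pos_ = (pos_ + 1) % buf_.size();
        return delayed;                          // y(n)
    }
private:
    std::vector<float> buf_;
    float g_, a_, lp_;
    size_t pos_;
};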
In the figure below you can see how the waveform and spectrogram of the signal are affected when the effect is applied.
Next, one of the implemented amplitude modulation effects, the tremolo, will be detailed.
The tremolo is the effect that results from multiplying the input signal by another signal that varies in time, so its expression in time is given by

$$y(n) = x(n)\left(1 + \alpha \sin\!\left(2\pi \frac{f_{LFO}}{f_s} n\right)\right)$$

where $\alpha$ is the depth of the modulation and $f_{LFO}$ the frequency of the low frequency oscillator.
A block diagram of the system can be seen in the figure below.
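The effect is a one-liner per sample; a minimal sketch (alpha and fLfo as above, names illustrative):

#include <cmath>

// Tremolo: multiply the input by a slow sinusoid around unity gain.
float tremolo(float x, long n, double fs, double fLfo, double alpha) {
    const double PI = 3.14159265358979323846;
    double lfo = 1.0 + alpha * std::sin(2.0 * PI * fLfo * n / fs);
    return (float)(x * lfo);
}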
In the figure below you can see how applying the tremolo effect affects the spectrogram.
To use the effects mentioned above, a GUI program separate from the main instrument-synthesis program was implemented, in order to apply the effects to .wav files. The design of the program can be seen in the figure below.
A small user manual for the program follows.
To begin applying effects, a file must first be loaded into the program. This is done from the top bar of the window, by entering the path and name of the .wav file to load. If everything goes well, a banner saying 'File uploaded successfully' pops up in the bottom right corner of the screen.
To apply the different effects to the loaded song, check the box of each effect you want to use and move its configuration sliders as desired.
Once the file has been loaded and the effects have been selected and configured, choose a name for the output file in the lower right bar and click the 'Save File' button.
As could be seen, the digital effects can generate very different kinds of sounds. It is important to clarify that, since some of the effects use IIR filters, the output through a speaker could saturate or clip. Special care must therefore be taken with the transfer coefficients of the IIR filters.
To develop the program, the C++ language was used, due to its ability to carry out tasks more quickly than other languages such as Python. This project focuses on synthesizing notes in real time and reproducing them, with effects, also in real time, so a program fast enough to process the necessary amount of data without the sound ever stalling was required.
For the graphical interface, the open-source Qt library was used, due to its relative ease of use and its extensive set of graphical features. Some screenshots of the program can be seen in the figures below.
To manage audio playback, the PortAudio library was used, since its functions are simple to understand and easy to apply. Furthermore, its callback mechanism offers a wide range of possibilities for applying algorithms to the signals about to be reproduced. Each synthesis method and each effect therefore had to implement the callbacks needed by PortAudio, following strict programming rules so that the algorithms do not take too much time. Some of the restrictions adopted were: a) do not allocate or free memory within the callbacks, b) keep the algorithmic complexity low, and c) do not process unnecessary information.
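For reference, a minimal PortAudio callback following those rules might look like this (a sketch, not Leandro's actual callback; the SynthState struct is hypothetical):

#include <cmath>
#include <portaudio.h>

// State prepared before the stream starts: everything preallocated,
// so the callback never touches the heap.
struct SynthState {
    double phase;
    double phaseInc;   // 2*pi*f/fs, set up ahead of time
};

static int audioCallback(const void* /*input*/, void* output,
                         unsigned long frameCount,
                         const PaStreamCallbackTimeInfo* /*timeInfo*/,
                         PaStreamCallbackFlags /*statusFlags*/,
                         void* userData) {
    SynthState* state = static_cast<SynthState*>(userData);
    float* out = static_cast<float*>(output);
    for (unsigned long i = 0; i < frameCount; i++) {
        out[i] = (float)std::sin(state->phase);   // O(1) work per sample
        state->phase += state->phaseInc;
    }
    return paContinue;   // keep the stream running
}

After Pa_Initialize(), the stream would be opened with Pa_OpenDefaultStream(&stream, 0, 1, paFloat32, sampleRate, framesPerBuffer, audioCallback, &state) and started with Pa_StartStream(stream).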
As discussed in previous sections, the different synthesis methods require instrument data and other parameters beforehand, so it was decided that all data loading happens at program startup. For this reason, although the program may take a few seconds to start, once started it can begin synthesizing any audio without loading screens or delays.
For each effect, the user is provided with different presets, along with dials to adjust the parameters of each effect to taste. In this way no effect is fixed: it can be modified, along with the synthesis parameters, to generate any desired sound and hear the result in real time.
Please do not hesitate to reach out to me if you find any issue with the code or if you have any questions.
- Personal email: ianczdiaz@gmail.com
- LinkedIn Profile: https://www.linkedin.com/in/iancraz/
Music Synthesizer by Ian C. Diaz is licensed under CC BY-NC-SA 4.0
You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license,
and indicate if changes were made. You may do so in any reasonable manner, but
not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must
distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological
measures that legally restrict others from doing anything the license permits.
Notices:
You do not have to comply with the license for elements of the material in the
public domain or where your use is permitted by an applicable exception or
limitation.
No warranties are given. The license may not give you all of the permissions
necessary for your intended use. For example, other rights such as publicity,
privacy, or moral rights may limit how you use the material.