partialtracking:
- new: apply scaling to multiple dimensions of a partial
- fix: copying a spectrum now also copies the partial data
- new: apply scaling to multiple dimensions for all partials
- new: synthesize gained a speed parameter
- new: rec method, to synthesize to a soundfile directly

snd.audiosample

- new: 'show' method for Sample, to call the html display within jupyter
- fix: reduce the default plotting resolution, since plotting still takes
  too long within jupyter. A user can still ask for a specific plotting
  profile
- new: fundamental analysis. This allows a Sample to trigger a fundamental
  analysis as implemented in maelzel.transcribe.mono
- new: playback is by default done via sounddevice instead of csoundengine

snd.deverb

New module implementing dereverberation and sustain removal. Sustain
removal is particularly helpful for f0 tracking since resonances of previous
notes can confuse the algorithm. This is mostly useful for piano and other
percussive sources since it depends on onset detection

snd.filters

New module implementing filters, in particular spectral filters for
detailed spectral modeling.

- new: spectralFilter, takes a list of pairs (frequency, gain) defining
  a bpf over the frequency spectrum.
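
As a rough illustration of how (frequency, gain) breakpoints can define a gain curve over the spectrum, the sketch below evaluates a piecewise-linear curve at a given frequency. It uses only the standard library. The name `bpfGain` and the choice to hold the first and last gain outside the defined range are assumptions made for this example; this is not the actual `spectralFilter` implementation.

```python
from bisect import bisect_right


def bpfGain(pairs: list[tuple[float, float]], freq: float) -> float:
    """Evaluate a piecewise-linear gain curve at freq (hypothetical helper).

    pairs: (frequency, gain) breakpoints, sorted by ascending frequency
    """
    freqs = [f for f, _ in pairs]
    gains = [g for _, g in pairs]
    # Assumption: hold the first / last gain outside the defined range
    if freq <= freqs[0]:
        return gains[0]
    if freq >= freqs[-1]:
        return gains[-1]
    # Index of the first breakpoint strictly above freq
    i = bisect_right(freqs, freq)
    f0, f1 = freqs[i - 1], freqs[i]
    g0, g1 = gains[i - 1], gains[i]
    # Linear interpolation between the two surrounding breakpoints
    return g0 + (g1 - g0) * (freq - f0) / (f1 - f0)
```

A spectral filter along these lines would multiply the magnitude of each analysis bin by the gain evaluated at that bin's frequency.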

snd.freqestimate

- fix: better default value for lowAmp

snd.generate

New module, generates simple signals as numpy arrays (white noise,
pink noise, gaussian noise), mostly for testing.

snd.vamptools

- new: generic pyin analysis, does a very thorough f0 analysis using
  the pyin vamp plugin, making as much information available as possible:
  voiced probability curve, f0 candidates over time, smooth pitch curve,
  raw f0 curve, smooth pitch curve masked with nan for unvoiced segments,
  rms curve (calculated externally to the plugin and used to enrich the
  analysis when checking if there is any signal at all at any given
  moment), rms histogram, number of f0 candidates over time. All this
  info is packed in a dataclass PyinResult
- fix: correct default low amplitude suppression to a better (lower)
  value

transcribe

- new: when transcribing, breakpoints belonging to a gesture are grouped
  together. Previously this was done with a plain python list; now a new
  BreakpointGroup is used instead. A group is defined by a sequence of
  breakpoints delimited by silences or onsets. The idea is to capture both
  the notion of gestalt and the nuances within such a gesture.
- new: renamed FundamentalAnalysisMono to FundamentalAnalysisMonophonic,
  to make clear that the mono is not referring to one channel but one source
- new: add sustain removal for transcription. This helps when doing fundamental
  analysis in contexts where the sustain or resonance of a previous event
  might leak into later events, making the analysis more difficult
- new: an analysis can be played
- new: an analysis can be plotted
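
The grouping of breakpoints into gestures can be pictured with a small sketch. It is only an illustration of the idea, not the BreakpointGroup class itself: breakpoints are reduced to their times, and a "silence" is modeled as a gap longer than `maxGap`. Both the function name `groupBreakpoints` and the `maxGap` parameter are hypothetical.

```python
def groupBreakpoints(times: list[float], maxGap: float) -> list[list[float]]:
    """Split ascending breakpoint times into groups (hypothetical helper).

    A new group starts whenever the gap to the previous breakpoint
    exceeds maxGap, which stands in for a detected silence.
    """
    groups: list[list[float]] = []
    for t in times:
        if groups and t - groups[-1][-1] <= maxGap:
            # Close enough to the previous breakpoint: same gesture
            groups[-1].append(t)
        else:
            # First breakpoint, or a gap long enough to count as silence
            groups.append([t])
    return groups
```

Keeping each gesture as its own group preserves the breakpoints inside it, so inflections within a gesture remain available after grouping.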
gesellkammer committed Apr 19, 2024
1 parent 9fc5999 commit 555bc25
Showing 36 changed files with 2,663 additions and 1,279 deletions.
8 changes: 4 additions & 4 deletions docs/notebooks/demo-transcribe.ipynb
Original file line number Diff line number Diff line change
@@ -13,7 +13,7 @@
"\n",
"When transcribing a human voice, which is a monophonic source with highly harmonic timbre for the pitched parts of speech/song, probably the most appropriate transcription method is based on the analysis of the fundamental frequency in combination with onset/offset prediction and other secondary features. \n",
"\n",
"`maelzel.transcribe.FundamentalAnalysisMono` implements the skeleton of such an approach:\n",
"`maelzel.transcribe.FundamentalAnalysisMonophonic` implements the skeleton of such an approach:\n",
"\n",
"1. Onset detection\n",
"2. The fundamental is sampled within each onset-offset timespan to include any pitch inflections. \n",
@@ -140,7 +140,7 @@
}
],
"source": [
"analysis = transcribe.FundamentalAnalysisMono(s0.samples, \n",
"analysis = transcribe.FundamentalAnalysisMonophonic(s0.samples, \n",
" sr=s0.sr, \n",
" # Quantize the pitch to its nearest 1/8th tone\n",
" semitoneQuantization=4, \n",
@@ -412,7 +412,7 @@
"</table>"
],
"text/plain": [
"<maelzel.transcribe.mono.FundamentalAnalysisMono at 0x7f0eb42d2750>"
"<maelzel.transcribe.mono.FundamentalAnalysisMonophonic at 0x7f0eb42d2750>"
]
},
"execution_count": 27,
@@ -775,7 +775,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.11.6"
}
},
"nbformat": 4,
67 changes: 61 additions & 6 deletions maelzel/_util.py
@@ -9,6 +9,7 @@
from maelzel.common import F
import functools
import appdirs
import logging


from typing import Callable, Sequence, TYPE_CHECKING
@@ -98,6 +99,7 @@ def reprObj(obj,
def fuzzymatch(query: str, choices: Sequence[str], limit=5
) -> list[tuple[str, int]]:
"""
Fuzzy matching
Args:
query: query to match
@@ -114,10 +116,31 @@ def fuzzymatch(query: str, choices: Sequence[str], limit=5
return thefuzz.process.extract(query, choices=choices, limit=limit)


def checkChoice(name: str, s: str, choices: Sequence[str], threshold=8):
def checkChoice(name: str, s: str, choices: Sequence[str], maxSuggestions=12, throw=True, logger: logging.Logger=None
) -> bool:
"""
Check that `s` is one of `choices`
Args:
name: what are we checking, used for error messages
s: the value to check
choices: possible choices
maxSuggestions: possible choices shown when s does not match any
throw: throw an exception if no match
logger: if given, any error will be logged using this logger
Returns:
True if a match was found, False otherwise
"""
if s not in choices:
if len(choices) > threshold:
matches = fuzzymatch(s, choices, limit=20)
if logger:
logger.error(f"Invalid value '{s}' for {name}, possible choices: {sorted(choices)}")

if not throw:
return False

if len(choices) > 8:
matches = fuzzymatch(s, choices, limit=maxSuggestions)
raise ValueError(f'Invalid value "{s}" for {name}, maybe you meant "{matches[0][0]}"? '
f'Other possible choices: {[m[0] for m in matches]}')
else:
@@ -161,8 +184,8 @@ def showF(f: F, maxdenom=1000) -> str:
"""
if f.denominator > maxdenom:
f2 = f.limit_denominator(maxdenom)
return "*%d/%d" % (f2.numerator, f2.denominator)
num, den = limitDenominator(f.numerator, f.denominator, maxden=maxdenom)
return f"~{num}/{den}"
return "%d/%d" % (f.numerator, f.denominator)


@@ -412,4 +435,36 @@ def intersectF(u1: F, u2: F, v1: F, v2: F) -> tuple[F, F] | None:
"""
x0 = u1 if u1 > v1 else v1
x1 = u2 if u2 < v2 else v2
return (x0, x1) if x0 < x1 else None
return (x0, x1) if x0 < x1 else None


def limitDenominator(num: int, den: int, maxden: int) -> tuple[int, int]:
"""
Copied from https://github.com/python/cpython/blob/main/Lib/fractions.py
"""
if maxden < 1:
raise ValueError("max_denominator should be at least 1")
if den <= maxden:
return num, den

p0, q0, p1, q1 = 0, 1, 1, 0
n, d = num, den
while True:
a = n // d
q2 = q0 + a * q1
if q2 > maxden:
break
p0, q0, p1, q1 = p1, q1, p0 + a * p1, q2
n, d = d, n - a * d
k = (maxden - q0) // q1

# Determine which of the candidates (p0+k*p1)/(q0+k*q1) and p1/q1 is
# closer to self. The distance between them is 1/(q1*(q0+k*q1)), while
# the distance from p1/q1 to self is d/(q1*self._denominator). So we
# need to compare 2*(q0+k*q1) with self._denominator/d.
if 2 * d * (q0 + k * q1) <= den:
return p1, q1
else:
return p0 + k * p1, q0 + k * q1
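
For comparison, the approximate display that `showF` produces for large denominators can be sketched with the standard library alone. This is a minimal sketch, assuming `fractions.Fraction.limit_denominator` yields the closest fraction within the denominator bound; the name `showFraction` is hypothetical and the function is not part of maelzel.

```python
from fractions import Fraction


def showFraction(f: Fraction, maxdenom: int = 1000) -> str:
    """Format a fraction, approximating it if its denominator is too large.

    Hypothetical stand-in for showF, built on Fraction.limit_denominator.
    """
    if f.denominator > maxdenom:
        # Closest fraction whose denominator does not exceed maxdenom
        approx = f.limit_denominator(maxdenom)
        # The leading "~" marks the result as an approximation
        return f"~{approx.numerator}/{approx.denominator}"
    return f"{f.numerator}/{f.denominator}"


print(showFraction(Fraction(1, 3)))                  # exact
print(showFraction(Fraction(355, 113), maxdenom=10))  # approximated
```

Working on plain integers, as `limitDenominator` does, avoids building an intermediate fraction object just to format it.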


7 changes: 5 additions & 2 deletions maelzel/core/_tools.py
@@ -468,11 +468,14 @@ def parseNote(s: str, check=True) -> NoteProperties:
if check:
if isinstance(notename, list):
for n in notename:
if n[-1] == '!':
n = n[:-1]
if not pt.is_valid_notename(n, minpitch=0):
raise ValueError(f"Invalid notename '{n}' while parsing '{s}'")
else:
if not pt.is_valid_notename(notename):
raise ValueError(f"Invalid notename '{notename}' while parsing '{s}'")
n = notename if notename[-1] != '!' else notename[:-1]
if not pt.is_valid_notename(n):
raise ValueError(f"Invalid notename '{n}' while parsing '{s}'")
return NoteProperties(notename=notename, dur=dur, keywords=properties,
symbols=symbols, spanners=spanners)

8 changes: 8 additions & 0 deletions maelzel/core/builtinpresets.py
@@ -196,6 +196,14 @@
builtin=True
),

PresetDef(
'.bandnoise',
code=r'''
|kbw=0.9|
aout1 = beosc(kfreq, kbw) * a(kamp)
'''
),

PresetDef(
'.sing', description="Simple vowel singing simulation",
init=r"""
44 changes: 19 additions & 25 deletions maelzel/core/clip.py
@@ -205,7 +205,8 @@ def __init__(self,
if offset is not None:
offset = asF(offset)

super().__init__(offset=offset, dur=dur, label=label)
super().__init__(offset=offset, dur=F0, label=label)
self._calculateDuration()

@property
def sr(self) -> float:
@@ -294,46 +295,39 @@ def durSecs(self) -> F:
def pitchRange(self) -> tuple[float, float]:
return (self.pitch, self.pitch)

def _durationInBeats(self,
absoffset: F | None = None,
scorestruct: ScoreStruct = None) -> F:
"""
Calculate the duration in beats without considering looping or explicit duration
Args:
scorestruct: the score structure
Returns:
the duration in quarternotes
"""
absoffset = absoffset if absoffset is not None else self.absOffset()
struct = scorestruct or self.scorestruct() or Workspace.getActive().scorestruct
starttime = struct.beatToTime(absoffset)
endbeat = struct.timeToBeat(starttime + self.durSecs())
return endbeat - absoffset

@property
def dur(self) -> F:
"The duration of this Clip, in quarter notes"
if self._explicitDur:
return self._explicitDur

absoffset = self.absOffset()
struct = self.scorestruct() or Workspace.getActive().scorestruct
struct = self.scorestruct(resolve=True)

if self._dur is not None and self._durContext is not None:
if self._dur and self._durContext is not None:
cachedstruct, cachedbeat = self._durContext
if struct is cachedstruct and cachedbeat == absoffset:
return self._dur

dur = self._durationInBeats(absoffset=absoffset, scorestruct=struct)
self._calculateDuration(absoffset=absoffset, struct=struct)
return self._dur

def _calculateDuration(self, absoffset: F = None, struct: ScoreStruct = None
) -> None:
if absoffset is None:
absoffset = self.absOffset()
if struct is None:
struct = self.scorestruct(resolve=True)
dur = struct.beatDelta(absoffset, absoffset + self.durSecs())
self._dur = dur
self._durContext = (struct, absoffset)
return dur

def __repr__(self):
return (f"Clip(source={self.source}, numChannels={self.numChannels}, sr={self.sr}, "
f"dur={self.dur}, sourcedursecs={_util.showT(self.sourceDurSecs)}secs)")
return (f"Clip(source={self.source}, "
f"numChannels={self.numChannels}, "
f"sr={self.sr}, "
f"dur={_util.showT(self.dur)}, "
f"sourcedur={_util.showT(self.sourceDurSecs)}s)")

def _synthEvents(self,
playargs: PlayArgs,
2 changes: 1 addition & 1 deletion maelzel/core/playback.py
@@ -251,7 +251,7 @@ class _SyncSessionHandler(SessionHandler):
def __init__(self, renderer: SynchronizedContext):
self.renderer = renderer

def sched(self, event: csoundengine.event.Event):
def schedEvent(self, event: csoundengine.event.Event):
return self.renderer._schedSessionEvent(event)

