Physical modeling of San Xian

Idea generation, Python-librosa, performing, writing: Yudan Zou (yudanwendyzou@gmail.com) – presented version in all portfolios

Extended version: JUCE/C++ and visualization: Chen Ji (jjjjjc12@gmail.com)

Overview

The SanXian (Chinese: 三弦, literally “three strings”) is a three-stringed traditional Chinese lute. It has a long fretless fingerboard, and the body is traditionally made from snake skin stretched over a rounded rectangular resonator. Although some Chinese string instruments, such as the guzheng and yangqin, are modeled in DAW synthesis libraries, a sound simulation of the San Xian is still missing. This project therefore tries to simulate the sound of the San Xian and to give users the experience of playing it through a keyboard interface.

San Xian Performance, with Extended Techniques, Variation on a Hubei Folk Scale

Tools for extended-techniques performance. Performing with extended techniques gives a better understanding of the San Xian’s character.

Selected Recordings for San Xian Improvisation

Acoustics of San Xian

San-Xian (三弦), as shown in the picture below, is a three-string fretless plucked instrument used in traditional Chinese folk Qu-Yi ensembles. It is also known as the origin of the Japanese instrument shamisen (三味线).

My motivation for the project is to rediscover the authentic ‘folk spirit’ of Chinese traditional instrumental music by introducing slight digitized timbre changes to the Sanxian in JUCE.

After the ‘reform and modernization of MinYue (Chinese folk instruments)’ (民乐改革) that began in China’s academic music scene in the 1950s, the goal of making the instruments ‘more symphonic’ (民乐交响化) overshadowed a certain down-to-earth, deeply folk spirit of Chinese music. The Sanxian, the most important accompanying instrument of Quyi genres (Beijing Qinshu, Suzhou Pingtan, Tianjin Shidiao, etc.; 北京琴书、苏州评弹、天津时调), was pushed to an insignificant position because of its loudness, its roughness, and the difficulty of blending it into a symphonized MinYue ensemble. During the boom of Chinese rock music in the 1990s, the father of metal singer He Yong (何勇), He Yusheng (何玉生), a professional Sanxian performer with the Central MinYue Orchestra, played the Sanxian on the stage of the Hong Kong Hung Hom Coliseum in his son’s signature rock song Bell Drum Tower (钟鼓楼). The short life of that Chinese rock highlight does not conceal the fact that it was a voice of the ‘people’ – the authentic contemporary folk music of China.

So that the Sanxian is not forgotten, and can be endowed with more possibilities, I started this project of a digitized Sanxian.

Through active listening, two characteristic features stand out:

  • Free glissando – from the fretless neck

  • Loud and folky – from the python-skinned drum resonator

Acoustic Characteristics of Sanxian

Tuning: G–d–g; length: 122 cm (48 in)

  • Components

Top and bottom of the body covered with snakeskin (python);

Long fretless neck of redwood or other hardwood that passes through the body, terminating in a spatula-shaped peg head;

Three hexagonal elongated pegs;

Three strings, traditionally silk (now metal or nylon).

[Figure: San Xian, with the sound box and tuning pegs labeled]

  • Synthesis

To quote the pioneer of modern San-Xian performance, Prof. Xiao Jiansheng of the China Conservatory of Music, the San-Xian generates sound through plucking in a multi-step process: first, the plucked string vibrates and by itself produces only a low-volume sound; second, the vibration resonates through the snake-skin-stretched rectangular ‘drum’ surface, which amplifies and colors it.

  • Spectrogram and Loudness
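The plucking process described above can be sketched as a two-stage source-filter chain in Python: a noise-excited string loop feeding a resonant filter standing in for the snakeskin drum head. This is only an illustrative sketch; the 600 Hz resonance and Q value are guesses rather than measured San Xian values, and the string loop anticipates the Karplus-Strong model described in the Methods section.

```python
import numpy as np
from scipy.signal import lfilter, iirpeak

fs = 22050
N = int(fs // 196)                              # delay line for a ~G3 string
b = np.zeros(N + 1); b[-1] = 1.0                # numerator: z^-N
a = np.zeros(N + 2); a[0] = 1.0
a[N] = -0.5; a[N + 1] = -0.5                    # loop filter H(z) = (1 + z^-1)/2
zi = np.random.rand(N + 1) - 0.5                # stage 1: string starts from noise
string, _ = lfilter(b, a, np.zeros(fs), zi=zi)  # one second of string vibration
bb, ab = iirpeak(600.0 / (fs / 2), Q=2.0)       # stage 2: hypothetical skin resonance
out = lfilter(bb, ab, string)                   # string energy colored by the "drum"
```

A measured San Xian body response could replace the single peak filter once the spectrogram analysis below pins down the actual resonances.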

Methods

  • Physical Modelling

Basic Theory

Karplus-strong String Synthesis

Technical Support in JUCE: Chen Ji (jjjjjc12@gmail.com)

According to Julius O. Smith’s physical modeling book [1], a very simple string model can be implemented using a single delay line and a simple first-order lowpass filter H(z) to model frequency-dependent loss. Different string qualities can be created by changing this filter. The following figure shows this model.

To simulate the player plucking the string, the Karplus-Strong algorithm initializes the delay line with white noise and sets H(z) = (1/2)(1 + z^-1). The transfer function of the whole system is then

    T(z) = z^-N / (1 - (1/2)z^-N - (1/2)z^-(N+1)),

where N is the delay-line length in samples.

Then we used MATLAB to simulate the string sound:

function y = karplus_strong(f, fs, len)
% f   = fundamental frequency of the string (in Hz)
% fs  = sample rate (in Hz)
% len = duration of the resulting sound (in seconds)
% y   = vector containing the output samples
  N = fix(fs/f);                       % delay-line length, rounded down
  x = zeros(1, len*fs);                % zero input: the sound comes from the initial conditions
  b = [zeros(1,N) 1];                  % numerator: z^-N
  a = [1 zeros(1,N-1) -.5 -.5];        % denominator: 1 - 0.5 z^-N - 0.5 z^-(N+1)
  zi = rand(1, max(numel(a), numel(b)) - 1) - 0.5;  % zero-mean white noise fills the delay line
  y = filter(b, a, x, zi);             % run the string loop
end

Here is a sample note scale generated by the above function:
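Since the analysis code in this project is written in Python, the same model can also be sketched with scipy.signal.lfilter. This is our port of the MATLAB function above, not part of the original project code:

```python
import numpy as np
from scipy.signal import lfilter

def karplus_strong(f, fs, dur):
    """Karplus-Strong pluck: a white-noise-initialized delay line
    fed back through the loop filter H(z) = (1 + z^-1)/2."""
    N = int(fs // f)                          # delay-line length in samples
    b = np.zeros(N + 1); b[-1] = 1.0          # numerator: z^-N
    a = np.zeros(N + 2)
    a[0] = 1.0; a[N] = -0.5; a[N + 1] = -0.5  # 1 - 0.5 z^-N - 0.5 z^-(N+1)
    zi = np.random.rand(max(len(a), len(b)) - 1) - 0.5  # zero-mean white noise
    x = np.zeros(int(dur * fs))               # zero input; energy comes from zi
    y, _ = lfilter(b, a, x, zi=zi)
    return y

tone = karplus_strong(196.0, 22050, 1.0)      # one second of a ~G3 pluck
```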

Analysis

Audio Features of the San Xian’s Timbre

DCMI: A Database of Chinese Instruments. The access was provided by Prof. Zijin Li of China Conservatory of Music.

About DCMI:

https://dlfm.web.ox.ac.uk/sites/default/files/dlfm/documents/media/zijin-et-al-dcmi.pdf

Cadenze

Analysis of Special Sanxian Technique 1

Huacai is the most common performance highlight among the Sanxian’s specialities. When the player accelerates the tempo across different pitches, a cadenza-like effect is created. Huacai is a basic component of a Sanxian piece, which makes it essential to analyze its acoustic nature and audio features in order to understand the Sanxian’s timbre. Since a cadenza contains multiple pitches, the techniques for analyzing a song’s audio features are applicable here.

import librosa
import librosa.display
import IPython
import numpy as np
import pandas as pd
import scipy
import matplotlib.pyplot as plt
import seaborn as sns
audio = "/content/drive/MyDrive/T0111sanxian/huacai1.wav"
y,sr=librosa.load(audio)
# Play the audio.
print('Audio Sampling Rate: '+str(sr)+' samples/sec')
print('Total Samples: '+str(np.size(y)))
secs=np.size(y)/sr
print('Audio Length: '+str(secs)+' s')
IPython.display.Audio(audio)
     Audio Sampling Rate: 22050 samples/sec
     Total Samples: 332621
     Audio Length: 15.08485260770975 s

Feature extraction of Technique 1: Cadenze – beat of the playing

y_harmonic, y_percussive = librosa.effects.hpss(y)
plt.figure(figsize=(15, 5))
librosa.display.waveplot(y_harmonic, sr=sr, alpha=0.25)
librosa.display.waveplot(y_percussive, sr=sr, color='r', alpha=0.5)
plt.title('Harmonic + Percussive')
tempo, beat_frames = librosa.beat.beat_track(y=y_harmonic, sr=sr)
print('Detected Tempo: '+str(tempo)+ ' beats/min')
beat_times = librosa.frames_to_time(beat_frames, sr=sr)
beat_time_diff=np.ediff1d(beat_times)
beat_nums = np.arange(1, np.size(beat_times))
fig, ax = plt.subplots()
fig.set_size_inches(15, 5)
ax.set_ylabel("Time difference (s)")
ax.set_xlabel("Beats")
g=sns.barplot(x=beat_nums, y=beat_time_diff, palette="BuGn_d", ax=ax)
g=g.set(xticklabels=[])

Chroma Energy Normalized (CENS) of Technique 1: Huacai in this phrase

chroma=librosa.feature.chroma_cens(y=y_harmonic, sr=sr)
plt.figure(figsize=(15, 5))
librosa.display.specshow(chroma,y_axis='chroma', x_axis='time')
plt.colorbar()
     
MFCC
mfccs = librosa.feature.mfcc(y=y_harmonic, sr=sr, n_mfcc=13)
plt.figure(figsize=(15, 5))
librosa.display.specshow(mfccs, x_axis='time')
plt.colorbar()
plt.title('MFCC')

Spectral Centroid, Spectral Contrast & Spectral Rolloff

cent = librosa.feature.spectral_centroid(y=y, sr=sr)
plt.figure(figsize=(15,5))
plt.subplot(1, 1, 1)
plt.semilogy(cent.T, label='Spectral centroid')
plt.ylabel('Hz')
plt.xticks([])
plt.xlim([0, cent.shape[-1]])
plt.legend()
contrast=librosa.feature.spectral_contrast(y=y_harmonic,sr=sr)
plt.figure(figsize=(15,5))
librosa.display.specshow(contrast, x_axis='time')
plt.colorbar()
plt.ylabel('Frequency bands')
plt.title('Spectral contrast')
rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr)
plt.figure(figsize=(15,5))
plt.semilogy(rolloff.T, label='Roll-off frequency')
plt.ylabel('Hz')
plt.xticks([])
plt.xlim([0, rolloff.shape[-1]])
plt.legend()

Features of the Sanxian Cadenze – playing at a fast pace on different pitches – feature #1: CENS
chroma_mean=np.mean(chroma,axis=1)
chroma_std=np.std(chroma,axis=1)
#plot the summary
octave=['C','C#','D','D#','E','F','F#','G','G#','A','A#','B']
plt.figure(figsize=(15,5))
plt.title('Mean CENS')
sns.barplot(x=octave,y=chroma_mean)
plt.figure(figsize=(15,5))
plt.title('SD CENS')
sns.barplot(x=octave,y=chroma_std)
#Generate the chroma Dataframe
chroma_df=pd.DataFrame()
for i in range(0,12):
    chroma_df['chroma_mean_'+str(i)]=chroma_mean[i]
for i in range(0,12):
    chroma_df['chroma_std_'+str(i)]=chroma_std[i]
chroma_df.loc[0]=np.concatenate((chroma_mean,chroma_std),axis=0)
chroma_df

Tremolo

audio = "/content/drive/MyDrive/T0111sanxian/tremolo1.wav"
y,sr=librosa.load(audio)
print('Audio Sampling Rate: '+str(sr)+' samples/sec')
print('Total Samples: '+str(np.size(y)))
secs=np.size(y)/sr
print('Audio Length: '+str(secs)+' s')
IPython.display.Audio(audio)
     Audio Sampling Rate: 22050 samples/sec
     Total Samples: 156884
     Audio Length: 7.114920634920635 s
y_harmonic, y_percussive = librosa.effects.hpss(y)
plt.figure(figsize=(15, 5))
librosa.display.waveplot(y_harmonic, sr=sr, alpha=0.25)
librosa.display.waveplot(y_percussive, sr=sr, color='r', alpha=0.5)
plt.title('Harmonic + Percussive')
tempo, beat_frames = librosa.beat.beat_track(y=y_harmonic, sr=sr)
print('Detected Tempo: '+str(tempo)+ ' beats/min')
beat_times = librosa.frames_to_time(beat_frames, sr=sr)
beat_time_diff=np.ediff1d(beat_times)
beat_nums = np.arange(1, np.size(beat_times))
fig, ax = plt.subplots()
fig.set_size_inches(15, 5)
ax.set_ylabel("Time difference (s)")
ax.set_xlabel("Beats")
g=sns.barplot(x=beat_nums, y=beat_time_diff, palette="BuGn_d", ax=ax)
g=g.set(xticklabels=[])

C++ Implementation

JUCE Framework

After simulating the KS string in MATLAB, the JUCE framework was used to build a real-time sanxian application in C++.

To analyze the timbre of the SanXian, I used Librosa (a Python library for audio and music processing). We obtained the spectra of the three open strings, shown below (from top to bottom: open string 1 to open string 3):

Open String 1: natural decay - Time Domain
audio = "/content/drive/MyDrive/T0111sanxian/Open String 1.wav"
y,sr=librosa.load(audio)
print('Audio Sampling Rate: '+str(sr)+' samples/sec')
print('Total Samples: '+str(np.size(y)))
secs=np.size(y)/sr
print('Audio Length: '+str(secs)+' s')
IPython.display.Audio(audio)
     Audio Sampling Rate: 22050 samples/sec
     Total Samples: 92111
     Audio Length: 4.177369614512472 s
x, sr = librosa.load('/content/drive/MyDrive/T0111sanxian/Open String 1.wav')
IPython.display.Audio(x, rate=sr)
plt.figure(figsize=(15, 5))
librosa.display.waveplot(x, sr, alpha=0.8)

hop_length = 512
n_fft = 2048
X = librosa.stft(x, n_fft=n_fft, hop_length=hop_length)
float(hop_length)/sr
float(n_fft)/sr
X.shape
S = librosa.amplitude_to_db(abs(X))
plt.figure(figsize=(15, 5))
librosa.display.specshow(S, sr=sr, hop_length=hop_length, x_axis='time', y_axis='linear')
plt.colorbar(format='%+2.0f dB')

hop_length = 256
S = librosa.feature.melspectrogram(x, sr=sr, n_fft=4096, hop_length=hop_length)
logS = librosa.power_to_db(abs(S))
plt.figure(figsize=(15, 5))
librosa.display.specshow(logS, sr=sr, hop_length=hop_length, x_axis='time', y_axis='mel')
plt.colorbar(format='%+2.0f dB')

fmin = librosa.midi_to_hz(36)
C = librosa.cqt(x, sr=sr, fmin=fmin, n_bins=72)
logC = librosa.amplitude_to_db(abs(C))
plt.figure(figsize=(15, 5))
librosa.display.specshow(logC, sr=sr, x_axis='time', y_axis='cqt_note', fmin=fmin)
plt.colorbar(format='%+2.0f dB')


# The other strings follow the same code pattern

After reading the relevant materials on how the sanxian works and how its strings are tuned, we set open strings 1–3 to an F–d–g tuning, and used this information to set the frequency of each string.
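As a sketch, the open-string frequencies and the corresponding Karplus-Strong delay-line lengths can be derived from the tuning. The octave placement (reading F–d–g as F2, D3, G3) and the A4 = 440 Hz reference are our assumptions, not values from the project:

```python
fs = 44100

def midi_to_hz(m):
    # Equal-tempered MIDI note to frequency, A4 (MIDI 69) = 440 Hz
    return 440.0 * 2.0 ** ((m - 69) / 12.0)

# Octave placement (F2, D3, G3) is our assumption
strings = {"open string 1 (F2)": 41,
           "open string 2 (D3)": 50,
           "open string 3 (G3)": 55}
for name, midi in strings.items():
    f = midi_to_hz(midi)
    N = round(fs / f)   # Karplus-Strong delay-line length for this string
    print(f"{name}: {f:7.2f} Hz -> N = {N} samples")
```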

You can download the source code and run the project.

Future Plans

Since we have not yet figured out how to simulate the different tones produced by plucking different positions along the string, we will continue working on the project to build a more engaging SanXian playing experience.
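One standard direction for this is the pick-position comb filter from Jaffe and Smith’s extensions of Karplus-Strong: subtracting a copy of the excitation delayed by a fraction of the string period suppresses the harmonics that have a node at the pluck point. A sketch (our code, not part of the project):

```python
import numpy as np
from scipy.signal import lfilter

def ks_pluck_position(f, fs, dur, pos=0.2):
    """Karplus-Strong with a pick-position comb filter on the excitation
    (Jaffe-Smith extension): plucking at fraction `pos` of the string
    length cancels harmonics with a node at that point."""
    N = int(fs // f)
    noise = np.random.rand(N) - 0.5            # one period of excitation noise
    M = max(1, int(pos * N))                   # comb delay: pluck point in samples
    excite = noise.copy()
    excite[M:] -= noise[:-M]                   # apply 1 - z^-M to the excitation
    b = np.zeros(N + 1); b[-1] = 1.0           # string loop: same as basic KS
    a = np.zeros(N + 2); a[0] = 1.0; a[N] = -0.5; a[N + 1] = -0.5
    x = np.zeros(int(dur * fs))
    x[:N] = excite                             # feed the shaped burst as input
    return lfilter(b, a, x)

bridge = ks_pluck_position(196.0, 22050, 1.0, pos=0.1)   # pluck near the bridge
middle = ks_pluck_position(196.0, 22050, 1.0, pos=0.5)   # pluck near the middle
```

Plucking nearer the bridge keeps more high harmonics, giving a brighter attack, while plucking near the middle sounds rounder.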