Wolfram Audio | Things to Try

Make edits and run any piece of code by clicking inside the code and pressing Shift+Enter.

Audio Processing & Analysis. Wolfram Audio offers highly optimized processing and high-level analysis of speech, music and other audio signals. Tight integration with machine learning and neural networks enables solutions in automated systems, security, medicine and more.

Import, Process and Export Audio

Import an audio file with high-frequency background noise:

In[]:=

music=Import["ExampleData/sample.flac"]

Show the audio spectrogram:

In[]:=

Spectrogram[music]

Remove the high-frequency background noise:

In[]:=

filteredmusic=LowpassFilter[music,Quantity[800,"Hertz"]]
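
To see what the cutoff does on a signal whose content is known, here is a minimal sketch on synthetic audio. The two-tone test signal, the names sr, samples, noisy and cleaned, and the 800 Hz cutoff written as a Quantity are assumptions for illustration, not part of the original example:

```wolfram
(* Assumed test signal: a 440 Hz tone plus a weaker 5 kHz tone standing in for high-frequency noise *)
sr = 44100;
samples = Table[0.6 Sin[2 Pi 440 t] + 0.2 Sin[2 Pi 5000 t], {t, 0., 1., 1./sr}];
noisy = Audio[samples, SampleRate -> sr];

(* Low-pass at 800 Hz: the 440 Hz tone lies below the cutoff, the 5 kHz tone above it *)
cleaned = LowpassFilter[noisy, Quantity[800, "Hertz"]];

(* Compare the spectra before and after filtering *)
Periodogram /@ {noisy, cleaned}
```

The second periodogram should show the 5 kHz peak strongly attenuated while the 440 Hz peak remains.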

See the spectrogram of the filtered audio:

In[]:=

Spectrogram[filteredmusic]

Export the result in the format of your choice:

In[]:=

Export["filteredexample.mp3",filteredmusic]

Visualize Audio Signals

Load audio from different types of sources:

In[]:=

sources=ExampleData[{"Audio",#}]&/@{"Cat","Cello","FemaleVoice"}

Audio waveforms show audio amplitude over time:

In[]:=

AudioPlot[#,PlotLayout->"Averaged"]&/@sources

Spectrograms show frequency over time:

In[]:=

Spectrogram/@sources

Periodograms show dominant frequencies:

In[]:=

Periodogram[#,256]&/@sources

Remove Noise or Add Effects

Start with a noisy audio clip:

In[]:=

apollo=ExampleData[{"Audio","Apollo11SmallStep"}]

Denoise the audio:

In[]:=

filteredapollo=WienerFilter[apollo,30]

Apply a pitch shift to the first half of the audio:

In[]:=

{beginning,end}=AudioSplit[filteredapollo,4];AudioJoin[AudioPitchShift[beginning,Quantity[…,"Octaves"]],end]

Extract Features from Audio

Begin with a dataset of spoken digits:

In[]:=

digitspeech=

Input	SpeakerID	Output
(audio)	Speaker A	1
(audio)	Speaker A	2
(audio)	Speaker A	3
(audio)	Speaker B	1
(audio)	Speaker B	2
(audio)	Speaker B	3
(audio)	Speaker C	1
(audio)	Speaker C	2
(audio)	Speaker C	3

;

Define a feature extractor using a pre-trained model:

In[]:=

extractor=NetAppend[NetTake[NetModel["Wav2Vec2 Trained on LibriSpeech Data"],"FeatureExtractor"],"Mean"->AggregationLayer[Mean,1]]

Show extracted features in a 3D plot:

In[]:=

Module{colors,styling},colors="Speaker A"->

,"Speaker B"->

,"Speaker C"->

;styling=

Function[

]

;Legended[FeatureSpacePlot3D[Normal[digitspeech[All,styling]],FeatureExtractor->extractor,LabelingFunction->None],PointLegend[Values[colors],Keys[colors]]]

Identify Audio Sources

Identify what’s in an audio signal:

In[]:=

AudioIdentify[…]



Begin with audio from mixed sources:

In[]:=

mixedsources=…;

Identify the dominant source by half-second segments:

In[]:=

segmentIds=AudioBlockMapAudioIdentify

&,mixedsources,{1,.5}//Normal
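
The partition specification {1,.5} asks for one-second windows that start every half second, so consecutive windows overlap by half. A minimal sketch of the same windowing on a synthetic tone, with an RMS measurement standing in for AudioIdentify (the test tone and the name levels are assumptions for illustration):

```wolfram
(* Assumed test signal: three seconds of a 440 Hz sine *)
tone = AudioGenerator[{"Sin", 440}, 3];

(* Apply a measurement to each one-second window, advancing half a second at a time *)
levels = AudioBlockMap[AudioMeasurements[#, "RMSAmplitude"] &, tone, {1, .5}];

(* Normal turns the resulting time series into a list of {time, value} pairs *)
Normal[levels]
```

For a steady tone the values should be nearly constant across windows; with AudioIdentify as the mapped function, each pair would instead carry the label identified for that window.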

Assemble the segments into intervals:

In[]:=

intervals=GroupBysegmentIds,Last,

Function[

]
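
One way such a grouping could work, sketched on made-up data. The {time, label} pairs below are hypothetical, and treating each time as the centre of a one-second window is an assumption; the function elided in the original may differ:

```wolfram
(* Hypothetical {time, label} pairs, shaped like a list of per-window identifications *)
toyIds = {{0.5, "Speech"}, {1., "Speech"}, {1.5, "Music"}, {2., "Music"}, {2.5, "Speech"}};

(* Group the times by label, then expand each time into a one-second window centred on it (assumption). *)
(* Interval merges overlapping windows into continuous stretches. *)
toyIntervals = GroupBy[toyIds, Last -> First,
  Function[times, Interval @@ ({# - 0.5, # + 0.5} & /@ times)]]
```

Each label then maps to a single Interval object whose spans cover the stretches where that label was identified, which is a convenient shape for shading regions on a plot.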



Plot the results:

In[]:=

LegendedAudioPlotmixedsources,



,SwatchLegend

