NeuralNote

Recorded: May 24, 2026, 1:59 a.m.

GitHub - DamRsn/NeuralNote: Audio Plugin for Audio to MIDI transcription using deep learning. · GitHub

NeuralNote is the audio plugin that brings state-of-the-art Audio to MIDI conversion into
your favorite Digital Audio Workstation.

Works with any tonal instrument (voice included)
Supports polyphonic transcription
Supports pitch bend detection
Lightweight and very fast transcription
Lets you adjust the parameters while listening to the transcription
Lets you scale- and time-quantize the transcribed MIDI directly in the plugin
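
To illustrate what scale and time quantization mean for transcribed notes, here is a minimal Python sketch. It is not NeuralNote's implementation (the plugin itself is written in C++): the function names, the default grid of a quarter beat, the use of a major scale, and the choice to resolve ties toward the lower pitch are all assumptions made for this example.

```python
import math

# Semitone offsets of a major scale relative to its root (assumed for the example).
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)


def quantize_time(time_beats: float, grid_beats: float = 0.25) -> float:
    """Snap a time (in beats) to the nearest grid line; exact halves round up."""
    return math.floor(time_beats / grid_beats + 0.5) * grid_beats


def quantize_to_scale(midi_note: int, root: int = 0, scale=MAJOR_SCALE) -> int:
    """Snap a MIDI note to the nearest pitch of the scale; ties go to the lower pitch."""
    allowed = [n for n in range(128) if (n - root) % 12 in scale]
    return min(allowed, key=lambda n: (abs(n - midi_note), n))


print(quantize_time(1.13))    # 1.25
print(quantize_to_scale(61))  # 60 (C#4 is snapped down to C4 in C major)
```

A real implementation would also have to decide what happens to note durations and overlapping notes after snapping, which this sketch leaves out.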

Install NeuralNote
Download the latest release for your platform from the repository's releases page (Windows, macOS (Universal) and Linux are supported).
Installers are available for both Windows and Mac, including Standalone, VST3, and AU (Mac only) versions. The
installers allow users to select which format(s) they want to install. On macOS, the code is signed, while on Windows,
it is not. This means you may need to take a few additional steps to use NeuralNote on Windows.
For Linux, raw binaries are provided for VST3 and Standalone. You can install them by copying the files to the
appropriate locations.
Usage

NeuralNote comes as a simple AudioFX plugin (VST3/AU/Standalone app) to be applied to the track you want to transcribe.
The workflow is very simple:

1. Gather some audio: click record (this works when recording live or when playing the track in a DAW), or drop an audio file on the plugin (.wav, .aiff, .flac, .mp3 and .ogg (Vorbis) are supported).
2. The MIDI transcription instantly appears in the piano roll section.
3. Listen to the result by clicking the play button. You can play with the different settings to adjust the transcription, even while listening to it, and individually adjust the levels of the source audio and of the synthesized transcription.
4. Once you're satisfied, export the MIDI transcription with a simple drag and drop from the plugin to a MIDI track.

Watch our presentation video for the Neural Audio Plugin competition here.
Internally, NeuralNote uses the model from Spotify's basic-pitch. See their blogpost and paper for more information. In NeuralNote, basic-pitch is run using RTNeural for the CNN part and ONNXRuntime for the feature part (Constant-Q transform calculation + Harmonic Stacking).
As part of this project, we contributed to RTNeural to add 2D convolution support.
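
For readers unfamiliar with Harmonic Stacking, the idea can be sketched in a few lines of Python. This is an illustration of the general technique only, not the code in Lib/Model: the function names, the resolution of 3 bins per semitone, and the short list of harmonics are assumptions chosen for the example. Each output channel is a copy of the CQT frame shifted along the frequency axis so that the energy found at a given harmonic of a pitch lines up with the bin of its fundamental.

```python
import math


def harmonic_shift_bins(harmonic: float, bins_per_semitone: int = 3) -> int:
    """Number of CQT bins between a frequency f and harmonic * f."""
    return round(12 * bins_per_semitone * math.log2(harmonic))


def harmonic_stack(cqt_frame, harmonics=(0.5, 1, 2, 3), bins_per_semitone=3):
    """Return one shifted, zero-padded copy of a CQT frame per harmonic."""
    n = len(cqt_frame)
    stacked = []
    for h in harmonics:
        shift = harmonic_shift_bins(h, bins_per_semitone)
        # The value at bin k of this channel is the input at bin k + shift,
        # so energy at harmonic h of a pitch lands on the fundamental's bin.
        channel = [cqt_frame[k + shift] if 0 <= k + shift < n else 0.0
                   for k in range(n)]
        stacked.append(channel)
    return stacked


# Toy frame: a fundamental at bin 10 and its octave, 36 bins higher.
frame = [0.0] * 80
frame[10] = 1.0
frame[46] = 0.5
channels = harmonic_stack(frame)
print(channels[1][10], channels[2][10])  # 1.0 0.5
```

After stacking, a small convolution kernel centered on one bin can see a note's fundamental and its harmonics at the same position across channels.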
Build from source
Requirements are: git, cmake, and your OS's preferred compiler suite.
Use this when cloning:
git clone --recurse-submodules --shallow-submodules https://github.com/DamRsn/NeuralNote

The OS-specific build scripts below have to be executed at least once before the project can be used as a normal CMake project. The script downloads the onnxruntime static library (which we created with ort-builder) before calling CMake.
macOS
$ ./build.sh

Windows
Due to a known issue, if you're not using Visual Studio 2022 (MSVC
version: 19.35.x, check cl output), then you'll need to manually build onnxruntime.lib like so:

Ensure you have Python installed; if not, download at https://www.python.org/downloads/windows/ (this does not
currently work with Python 3.11, prefer Python 3.10).

Execute each of the following lines in a command prompt:

git clone --depth 1 --recurse-submodules --shallow-submodules https://github.com/tiborvass/libonnxruntime-neuralnote ThirdParty\onnxruntime
cd ThirdParty\onnxruntime
python3 -m venv venv
.\venv\Scripts\activate.bat
pip install -r requirements.txt
.\convert-model-to-ort.bat model.onnx
.\build-win.bat model.required_operators_and_types.with_runtime_opt.config
copy model.with_runtime_opt.ort ..\..\Lib\ModelData\features_model.ort
cd ..\..

Now you can get back to building NeuralNote as follows:
> .\build.bat

IDEs
Once the build script has been executed at least once, you can load this project in your favorite IDE
(CLion/Visual Studio/VSCode/etc) and click 'build' for one of the targets.
Reuse code from NeuralNote’s transcription engine
All the code to perform the transcription is in Lib/Model and all the model weights are in Lib/ModelData/. Feel free
to use only this part of the code in your own project! We'll try to isolate it more from the rest of the repo in the
future and make it a library.
The code to generate the files in Lib/ModelData/ is not currently available as it required a lot of manual operations.
But here's a description of the process we followed to create those files:

features_model.onnx was generated by converting a Keras model containing only the CQT + Harmonic Stacking part of the full basic-pitch graph using tf2onnx (with manually added weights for batch normalization).
The .json files containing the weights of the basic-pitch CNN were generated from the TensorFlow.js model available in the basic-pitch-ts repository, then converted to ONNX with tf2onnx. The weights were then gathered manually into .npy files using Netron and finally applied to a split Keras model created with the basic-pitch code.

The original basic-pitch CNN was split into four sequential models wired together, so that they can be run with RTNeural.
Bug reports and feature requests
If you have any request/suggestion concerning the plugin or encounter a bug, please file a GitHub issue.
Contributing
Contributions are most welcome! If you want to add some features to the plugin or simply improve the documentation,
please open a PR!
License
NeuralNote software and code are published under the Apache-2.0 license. See the license file.
Third Party libraries used and license
Here's a list of all the third party libraries used in NeuralNote and the license under which they are used.

JUCE (JUCE Starter)
RTNeural (BSD-3-Clause license)
ONNXRuntime (MIT License)
ort-builder (MIT License)
basic-pitch (Apache-2.0 license)
basic-pitch-ts (Apache-2.0 license)
minimp3 (CC0-1.0 license)

Could NeuralNote transcribe audio in real-time?
Unfortunately not, for a few reasons:

Basic Pitch uses the Constant-Q transform (CQT) as input feature. The CQT requires really long audio chunks (> 1s) to
get amplitudes for the lowest frequency bins. This makes the latency too high to have real-time transcription.
The basic pitch CNN has an additional latency of approximately 120ms.
The note events creation algorithm processes the posteriorgrams backward (from future to past) and is hence
non-causal.

But if you have ideas, please share!
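
The first reason can be made concrete with a little arithmetic. In a constant-Q transform, the analysis window for a bin centered at frequency f lasts roughly Q / f seconds, where Q = 1 / (2^(1/B) - 1) for B bins per octave. The sketch below is only an illustration: the lowest frequency (27.5 Hz, the note A0) and the two bin resolutions are assumed values chosen for the example, not parameters read from NeuralNote or basic-pitch, and the helper name is hypothetical.

```python
def cqt_window_seconds(freq_hz: float, bins_per_octave: int) -> float:
    """Approximate duration of the CQT analysis window for one bin, in seconds."""
    # Quality factor: ratio of a bin's center frequency to its bandwidth.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    return q / freq_hz


# Assumed values, for illustration only.
LOWEST_FREQ_HZ = 27.5
for bins_per_octave in (12, 36):
    window = cqt_window_seconds(LOWEST_FREQ_HZ, bins_per_octave)
    print(f"{bins_per_octave} bins/octave: {window:.2f} s at {LOWEST_FREQ_HZ} Hz")
# 12 bins/octave: 0.61 s at 27.5 Hz
# 36 bins/octave: 1.87 s at 27.5 Hz
```

Under these assumptions, the lowest bin alone needs well over a second of audio at the finer resolution. The window halves with each octave up, so it is the lowest bins that dominate the latency.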
Credits
NeuralNote was developed by Damien Ronssin and Tibor Vass.
The plugin user interface was designed by Perrine Morel.
Contributors
Many thanks to the contributors!

jatinchowdhury18: File browser.
trirpi: More scale options in SCALE QUANTIZE; horizontal zoom for the audio waveform and the piano roll.
polygon and SamuMazzi: Linux support.

About

Audio Plugin for Audio to MIDI transcription using deep learning.

Topics: audio, machine-learning, midi, vst, audio-plugin, juce-framework
License: Apache-2.0 license
Stars: 2.6k
Watchers: 60
Forks: 171
Releases: 4 (latest: v1.1.0, Jan 11, 2025)
Languages: C++ 93.3%, CMake 2.6%, Shell 1.5%, Python 1.4%, Batchfile 0.5%, Inno Setup 0.4%, C 0.3%

Generated from eyalamirmusic/JUCECmakeRepoPrototype

NeuralNote is an audio plugin that performs state-of-the-art Audio to MIDI transcription inside a Digital Audio Workstation. It converts audio from any tonal instrument, voice included, into MIDI, and supports polyphonic transcription and pitch bend detection. Transcription is lightweight and fast, parameters can be adjusted while listening to the result, and the transcribed MIDI can be scale and time quantized directly in the plugin. The workflow is straightforward: record audio or drop an audio file on the plugin, view the resulting MIDI in the piano roll, listen to the transcription, fine-tune the settings, and export the MIDI by dragging it to a MIDI track.

Internally, NeuralNote uses the model from Spotify's basic-pitch. The model is run with RTNeural for the Convolutional Neural Network (CNN) part and with ONNXRuntime for the feature part, which computes the Constant-Q transform and Harmonic Stacking. As part of the project, the developers contributed 2D convolution support to RTNeural. The transcription code lives in Lib/Model and the model weights in Lib/ModelData/, so other developers can reuse the transcription engine in their own projects.

NeuralNote cannot transcribe audio in real time, for three reasons. First, the Constant-Q transform that Basic Pitch uses as its input feature requires long audio chunks (more than one second) to get amplitudes for the lowest frequency bins, which makes the latency too high. Second, the Basic Pitch CNN adds approximately 120 ms of latency. Third, the note event creation algorithm processes the posteriorgrams backward (from future to past) and is therefore non-causal.

The project is open source under the Apache-2.0 license. Installers are available for Windows and macOS, and raw VST3 and Standalone binaries are provided for Linux. Building from source requires git, cmake, and a compiler suite, and the OS-specific build script must be run at least once because it downloads the onnxruntime static library before calling CMake. NeuralNote was developed by Damien Ronssin and Tibor Vass, and the user interface was designed by Perrine Morel. Third-party libraries used include JUCE, RTNeural, ONNXRuntime, ort-builder, basic-pitch, basic-pitch-ts, and minimp3.