@giuliomoro We have a Magnetic Resonator Piano equipped with optical reflectance sensors which is currently making use of a desktop running Mac OS applications: a legacy version of the TouchKeys software which generates OSC messages, and a specialized MRP application which parses the OSC and then generates audio and the necessary MIDI messages for use by the hardware which routes the signals to their corresponding resonators.

We are attempting to design an embedded system for control of the instrument, in order to eliminate the need for a desktop computer. We have a Bela, and have managed to use a MIDI controller with Bela and Pure Data to function as a polyphonic synthesizer which generates the audio and MIDI messages needed to produce sound on the instrument.

Our next hurdle is determining how to parse the data coming from the serial connection from the key sensor board. In the current implementation, it seems that the key sensor board is treated as a TouchKeys device, and the key position is simply interpreted as the X position of a TouchKey.

My question is, do you think we can use Bela to parse the data coming from a TouchKeys device directly from that device, without first going through the TouchKeys software?

I noticed your GitHub repository at this address: https://github.com/giuliomoro/bela-modular-touchkeys but it appears to just parse already generated MIDI, rather than the actual serial data. Any more information on this repo?

I am currently working on reverse engineering the source code for the legacy version of TouchKeys which is currently responsible for the parsing, but this is slow going, as you could imagine.

I would really appreciate hearing any thoughts you, or anyone else, have on this matter, and any resources you could point me to that might facilitate this project's development. Thanks in advance!

    wa3573 We have a Magnetic Resonator Piano

    I take it you are based at Drexel university?

    wa3573 Any more information on this repo?

    that one is not very relevant, as you noticed. This one, however, is VERY relevant. In this case I use Bela as the SPI master for the boards on the scanner you use, so I can get the sensor reading, calibrate them, and pass them on to a program running on Bela. I use a simple breakout capelet I designed which comes with the same connector as the scanner boards. This setup comes with a caveat, though: it currently only works with 3 boards. If you want to use all of the four boards, you would need to jumper an extra wire to the second board. I think this way you will be able to bring the data in with more accuracy and less overhead than over USB/UART, but I have not actually compared the two solutions.

    From looking at the Touchkeys software with the MRP extensions, I see that it may not be extremely suitable for running on the embedded device, as it's overly complicated. The best option would probably be to strip out the pieces you need (i.e.: the feature extraction from the key position) and run those on Bela.

    I am currently using the above spi-pru code to calibrate data I send to the Touchkey(+MRP) software, just to use the feature extraction that comes with it. If I find that it works well enough, I may be stripping it out at some point in the near future to run it on Bela.

    We have a student here who may start working on porting the MRP to Bela in the next couple of weeks, however, they will not be done any time soon (their deadline would be August 2019).

      giuliomoro

      Giulio,

      Thanks so much for the prompt and extremely helpful reply! Indeed, we are at Drexel.

      I've been reading over your repository and getting up to speed with your implementation. I don't have experience with PRUs, so it may take some time for me to understand everything thoroughly. Nonetheless, you have doubtless saved us a bunch of time with this.

      Forgive my excessive questioning, and feel free to contribute as little or as much as you wish! Regarding your hardware setup with the breakout board, let me see if I can get this straight. The breakout board you designed uses the GPIO pins as an interface for the serial connection to the sensor boards? And you have the sensor boards directly connected to this breakout board via the terminating ribbon cable? What is powered/communicated by the extra jumper wire? I would be very interested in this implementation, as it does make sense that this would reduce overhead. At the very least, if I understand correctly, it would eliminate the intermediary USB interface currently in use. Less complexity is good for our purposes.

      On that note, what would change if we chose to go with the USB/UART route? I would think the PRU implementation would be rewritten to use UART instead of GPIO, but I'm not sure what else would change. The reason I ask is we may go that route rather than manufacture the break out board, although that seems like it would be straightforward enough.

      I agree with you entirely that the Touchkeys software is overly complicated for the application, and I was planning on stripping out the functionality you mentioned.

      Glad to hear that we're not the only group who is working on this, it's an interesting problem. Thanks again for all your help!

        wa3573 The breakout board you designed uses the GPIO pins as an interface for the serial connection to the sensor boards?

        yes

        wa3573 And you have the sensor boards directly connected to this breakout board via the terminating ribbon cable?

        yes, that is why you can currently only use 3 boards

        wa3573 At the very least, if I understand correctly, it would eliminate the intermediary USB interface currently in use.

        yes.

        This way, I could write a real-time SPI driver on the PRU. Using USB would not allow a real-time driver. It may be much easier to actually use the UART over USB, but its performance will depend on system load (i.e.: if too much CPU is used for audio, the UART may choke). This may be good enough, not sure. Requires testing!

        I have some code here that runs on Bela: it opens and configures a serial port and it sends TouchKeys message data over it. In my use case, this acts as the "device": as far as the TouchKeys software on the host is concerned, my software on Bela emulates the behaviour of the piano scanner. In your case, you want the software on Bela to act as the "host" and communicate to the "device" (the real piano scanner). This code is still a good starting point because it shows the message syntax and how to do serial communication. When you plug the USB of the piano scanner to Bela, you should be able to see it show up as a serial interface in /dev/.
        https://github.com/giuliomoro/serial-piano-scanner
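For the "host" side described above, a minimal sketch of opening the scanner's serial interface in raw mode with POSIX termios might look like the following. The function name `openRawSerial`, the device path in the usage note and the baud rate are assumptions for illustration, not values taken from the TouchKeys code; the message syntax itself is best taken from the repository linked above.

```cpp
#include <cassert>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

// Hypothetical helper: open a serial device in raw (binary) mode.
// Returns a file descriptor on success, or -1 on any failure.
int openRawSerial(const char* path, speed_t baud) {
    assert(path != nullptr);
    int fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;

    struct termios tty {};
    if (tcgetattr(fd, &tty) != 0) {
        close(fd);
        return -1;
    }
    cfmakeraw(&tty);                 // no echo, no line editing, 8-bit clean
    cfsetispeed(&tty, baud);
    cfsetospeed(&tty, baud);
    tty.c_cflag |= (CLOCAL | CREAD); // ignore modem lines, enable receiver
    tty.c_cc[VMIN] = 0;              // read() may return with 0 bytes...
    tty.c_cc[VTIME] = 1;             // ...after a 0.1 s timeout
    if (tcsetattr(fd, TCSANOW, &tty) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A call such as `openRawSerial("/dev/ttyACM0", B115200)` would be a typical starting point, but both the path and the speed are guesses: check which entry actually appears in /dev/ when the scanner is plugged in. Since this does USB I/O, it belongs in a non-real-time thread rather than in the audio callback.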

        Awesome! Thanks again for all the help, this is all extremely useful. I'm going to study all the code you've provided and let you know what I come up with.

        Also, watched your talk you included in this thread: https://forum.bela.io/d/409-threads-best-practice/2

        Very informative and well explained!

        a month later

        So, with the help of all your code (thanks again!) I have put together a simple program to test parsing the information from the USB interface to the scanner boards. This is based heavily on your serial-piano-scanner repo. Here's a link if you want to take a look at what I've got so far:
        https://github.com/wa3573/Drexel_MRP_Key_Scanner

        We will be testing it today and will see what happens. Assuming we can get this to work in some capacity, the next step will be integrating this program with our existing Pd patch, which is responsible for the audio signal and MIDI generation. My plan so far would be to base the integration on the default libpd render file located here:
        https://github.com/BelaPlatform/Bela/blob/master/core/default_libpd_render.cpp#L384-L385

        And create a thread which would parse the key sensor data, convert it to MIDI, and then create and send the noteOn messages via Bela's built-in function calls.

        It seems like it would be better to have Pd only responsible for the DSP aspect of things, and therefore simplify it from
        [C++] -> [MIDI] -> [Pure Data] -> [MIDI + Audio]
        to
        [C++]->[MIDI]
        [C++]->[Pure Data]->[Audio]

        But I'm not sure if that would decrease overhead, and whether it might introduce synchronization issues.

        So my questions would be:
        - Is there a more efficient protocol for communicating noteOn style messages to Pd than MIDI?
        - Do you think there would be a benefit to removing the intermediary MIDI step in the chain of communication between C++ and Pd?
        - What do you think of the overall plan so far? Does it sound feasible?

        Thanks for all your help!

        Edit: Okay, I completely redesigned things and implemented threads using the pthread library, in the hope that I can use the Xenomai POSIX wrappers in practice. I also implemented a circular buffer. Feel free to take a look.

          8 days later

          I see you are using KeyPositionTracker.cpp. It turns out that there was a problem with missing_value() that made the state machine in there fail sometimes. I fixed it upstream: https://github.com/giuliomoro/touch-keyboard-juce/commit/c4041ba585d70f6123658af8dcc10beb2fdbfa9c (and also added the file to my serial-piano-scanner repo).

          wa3573 - Do you think there would be a benefit to removing the intermediary MIDI step in the chain of communication between C++ and Pd?

          I would recommend you avoid running two separate processes, one for the audio/PD and one for the USB/serial. If you run them both in the same process (that is: a Bela program with a custom libpd render.cpp which in an auxiliary task does the USB/serial I/O), then you can either decide to send messages into Pd (with libpd_float() or libpd_start_message()...libpd_add_float()....libpd_finish_message() or similar), or calling the "MIDI" callbacks (libpd_noteon etc). I don't think you would find much performance difference between sending messages or calling the libpd_ "MIDI" callbacks: the latter, I assume, are just a wrapper for regular messages, to make them come out of specific objects ([notein] etc).

          wa3573 But I'm not sure if that would decrease overhead, and whether it might introduce synchronization issues.

          You surely need to make any call to Pd from within the audio thread. So you cannot simply call, e.g.: libpd_noteon() from the serial thread. You should instead queue your messages there and then in the audio thread you read them from the queue and call anything you want Pd-related.
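The queueing pattern described above could be sketched roughly as follows: a single-producer/single-consumer ring buffer that the serial thread pushes into and the audio thread drains. `NoteEvent`, `SpscQueue` and `drainQueue` are hypothetical names introduced for this illustration; the handler passed to `drainQueue` is the place where a call such as libpd_noteon() would go, since it runs in the audio thread.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical message type: one note event produced by the serial thread.
struct NoteEvent {
    int channel;
    int pitch;
    int velocity;
};

// Lock-free single-producer/single-consumer queue with a fixed capacity.
// It never allocates after construction, so pop() is safe in the audio thread.
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
public:
    // Producer side (serial I/O thread only). Returns false if the queue is full.
    bool push(const T& item) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity)
            return false;
        buffer_[head & (Capacity - 1)] = item;
        head_.store(head + 1, std::memory_order_release);
        return true;
    }
    // Consumer side (audio thread only). Returns false if the queue is empty.
    bool pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (tail == head)
            return false;
        out = buffer_[tail & (Capacity - 1)];
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }
private:
    std::array<T, Capacity> buffer_{};
    std::atomic<std::size_t> head_{0};
    std::atomic<std::size_t> tail_{0};
};

// Call once per audio block. Handles at most maxEvents events so the work
// done per block stays bounded. Returns the number of events handled.
template <typename Queue, typename Handler>
int drainQueue(Queue& queue, int maxEvents, Handler&& handle) {
    assert(maxEvents > 0);
    int count = 0;
    NoteEvent ev{};
    while (count < maxEvents && queue.pop(ev)) {
        handle(ev); // e.g. forward to Pd from here
        ++count;
    }
    return count;
}
```

The queue only works with exactly one pushing thread and one popping thread. Dropping events when the queue is full is a design choice of this sketch; a larger capacity or a different overflow policy may suit a key scanner better.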

          wa3573 [C++] -> [MIDI] -> [Pure Data] -> [MIDI + Audio]

          Not sure where the MIDI output from Pure Data would be going to?

            6 days later

            Awesome, thanks for the reply. I have actually managed to port almost the entire TouchKeys source (minus the GUI, of course) to the Bela, and managed to build it successfully. I created some adapter classes for the necessary JUCE resources. I actually stuck with boost for the circular_buffer, even though I had already implemented one, if it's not broken, why fix it? I will be testing the code this coming week and gradually stripping out the parts that are unnecessary and causing bloat, to try to optimize the use of resources and cut down on CPU load. Check out the repo I posted above, if you're interested.

            Thanks for the tip about missing_value(). It's just the Types.h file, correct? I'll incorporate that into my build.

            giuliomoro If you run them both in the same process (that is: a Bela program with a custom libpd render.cpp which in an auxiliary task does the USB/serial I/O), then you can either decide to send messages into Pd (with libpd_float() or libpd_start_message()...libpd_add_float()....libpd_finish_message() or similar)

            Indeed, I was planning on adapting the available render.cpp to run auxiliary tasks, rather than have separate processes. On that note, I looked at the source for Bela's auxiliary tasks, and it seems they just utilize the pthread library. Since I used the pthread library to retrofit the TouchKeys source code, I assume I could just use the Xenomai wraps as is done here: https://github.com/BelaPlatform/Bela/blob/master/core/AuxiliaryTasks.cpp in order to make everything function with the Bela core? Or would it be better to just use Bela's auxiliary tasks, so that it can take care of prioritizing their operation?

            libpd_start_message() is helpful, thanks. That will give us some more modularity, in case we want to send something other than what the standard MIDI byte encapsulates.

            giuliomoro You should instead queue your messages there and then in the audio thread you read them from the queue and call anything you want Pd-related.

            Great, makes sense.

            giuliomoro Not sure where the MIDI output from Pure Data would be going to?

            I suppose I should have included that in the ASCII diagram. The MIDI output is for the MRP hardware, it uses MIDI to know how to route the audio signal to the pre-actuator amplifier boards.

              wa3573 I actually stuck with boost for the circular_buffer, even though I had already implemented one, if it's not broken, why fix it?

              A good reason would be to avoid the dependency on boost, especially if that is the only component you are using from the whole library. Looking at the API of boost::circular_buffer, it should be fairly easy to replicate with std::vector and an extra index.
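As a rough illustration of that suggestion, a fixed-capacity ring buffer on top of std::vector could look like the sketch below. `RingBuffer` is a hypothetical name, it covers only a small subset of what boost::circular_buffer offers, and it assumes the element type is default-constructible. It tracks a start index plus a size, which is one of several possible layouts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal fixed-capacity ring buffer. When full, push_back() overwrites the
// oldest element, as boost::circular_buffer does.
template <typename T>
class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity)
        : data_(capacity), start_(0), size_(0) {
        assert(capacity > 0);
    }

    void push_back(const T& value) {
        const std::size_t cap = data_.size();
        if (size_ < cap) {
            data_[(start_ + size_) % cap] = value;
            ++size_;
        } else {
            data_[start_] = value;        // overwrite the oldest element
            start_ = (start_ + 1) % cap;  // the next one becomes the oldest
        }
    }

    // Index 0 is the oldest element, size() - 1 the newest.
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[(start_ + i) % data_.size()];
    }
    const T& front() const { return (*this)[0]; }
    const T& back() const { return (*this)[size_ - 1]; }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return data_.size(); }
    bool empty() const { return size_ == 0; }
    bool full() const { return size_ == data_.size(); }

private:
    std::vector<T> data_;
    std::size_t start_; // position of the oldest element
    std::size_t size_;  // number of valid elements
};
```

The storage is allocated once in the constructor, so push_back() does not allocate afterwards. The class is not thread-safe on its own.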

              wa3573 cut down on CPU load

              One place where I think the original TK software is pretty inefficient is in the fact that a new mapping object is created every time the key position exceeds a given threshold. I'd suggest to pre-allocate all the objects and only start calling Trigger() when the key position exceeds a given threshold, until the key press is completed.
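A minimal sketch of that pre-allocation idea, under stated assumptions: `KeyMapping` is a hypothetical stand-in for the TouchKeys mapping classes, `MappingPool` is an invented name, and "key press completed" is approximated here by the position falling below a release threshold, which may not match how the real state machine decides it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-key mapping object.
struct KeyMapping {
    bool active = false;
    int triggerCount = 0;
    float lastPosition = 0.0f;

    void trigger(float position) {
        lastPosition = position;
        ++triggerCount;
    }
};

// Owns one mapping per key, all allocated up front. Nothing is created or
// destroyed while keys are being played.
class MappingPool {
public:
    MappingPool(std::size_t numKeys, float pressThreshold, float releaseThreshold)
        : mappings_(numKeys), press_(pressThreshold), release_(releaseThreshold) {
        assert(release_ <= press_);
    }

    // Call for every new position sample of a key.
    void onKeyPosition(std::size_t key, float position) {
        assert(key < mappings_.size());
        KeyMapping& m = mappings_[key];
        if (!m.active && position > press_)
            m.active = true;              // key press starts
        if (m.active) {
            m.trigger(position);
            if (position < release_)
                m.active = false;         // key press completed (assumption)
        }
    }

    const KeyMapping& mapping(std::size_t key) const { return mappings_[key]; }

private:
    std::vector<KeyMapping> mappings_;
    float press_;
    float release_;
};
```

For example, `MappingPool pool(88, 0.2f, 0.05f);` would create all 88 mappings once at startup; the threshold values are placeholders.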

              wa3573 Since I used the pthread library to retrofit the TouchKeys source code

              How many threads do you have currently?

              wa3573 Since I used the pthread library to retrofit the TouchKeys source code, I assume I could just use the Xenomai wraps as is done here

              The use of plain or __wrap() threads really depends on what each thread is doing. What I normally do, as a general rule, is:
              - a thread with real-time requirements, and which is real-time-safe, should be created as a Xenomai thread (using __wrap_pthread... functions)
              - a thread that is not real-time safe (e.g.: it accesses Linux drivers, does disk or network or USB I/O) should be created as a regular thread (using pthread_... functions).
              However, I am not 100% positive that this is the best approach in all cases.

              Things become more complicated when using synchronization primitives: any Xenomai thread should only use mutexes, condition variables, semaphores, message queues, that are themselves managed by Xenomai (i.e.: created, modified and accessed with the __wrap_ functions). Failing to do so would cause a "mode switch", that is the thread would temporarily turn into a Linux thread, thus losing real-time guarantees. You definitely don't want to do this for threads that have real-time requirements. On the other hand, non-Xenomai threads cannot access synchronization primitives that were created by Xenomai.
              There is only one synchronization primitive, that also works as message-passing, that would allow Xenomai threads to communicate with non-Xenomai threads (even across-processes) without causing a mode switch in the Xenomai thread: XDDP (cross-domain datagram protocol). This is what we use e.g.: in the Midi and AuxTaskNonRT classes. Check out the Xenomai XDDP examples for some more details.

              So, I think (but I didn't benchmark it) that XDDP is the fastest way of communicating between a RT and a non-RT task, where the former is a Xenomai task and the latter is not. However, it makes the code less portable, by using this custom protocol.
              As an alternative, if there are one real-time and one non-real-time thread that need to share resources, I sometimes use an unorthodox approach: make them both Xenomai threads. You can keep the priority of the non-real-time thread at 0, and this thread will switch automatically between primary ("Xenomai RT-safe") and secondary ("Linux") mode whenever needed. I.e.: a thread that does serial I/O but also shares a mutex with the audio thread would switch to primary mode when it tries to get the mutex, and would switch back to secondary mode when doing I/O. This comes with a performance penalty: every time the thread switches mode, it wastes some CPU cycles. I am not sure exactly how many; I think you lose about 20-40 microseconds for each mode switch.

              If you are planning on making all of your threads Xenomai threads, and all of the pthread_ calls Xenomai calls, then you actually ... could avoid that altogether: there are some compiler flags provided by Xenomai that will just turn all of your pthread_ (and some more) calls into the equivalent Xenomai call. See the documentation here (section: Under the hood: the --wrap flag). The Bela Makefile currently deliberately removes these linker flags (see the Makefile source where it assigns DEFAULT_XENOMAI_LDFLAGS). You will have to re-add them for your build.

              To sum up, I would encourage you to implement all you need using the regular pthread_, then compile and link your application adding the appropriate LDFLAGS. Try to run it this way and see if it all works fine. If you have one thread that switches mode very often (you can monitor the MSW column in the threads stats in /proc/xenomai/sched/stat while the program is running), so much that it becomes a performance issue, then you may want to consider moving that (those) thread(s) to a regular Linux thread (calling __real_pthread_...), and use XDDP to communicate with it from a RT thread (if needed).

              Hope this helps.

                giuliomoro A good reason would be to avoid the dependency on boost, especially if that is the only component you are using from the whole library. Looking at the API of boost::circular_buffer, it should be fairly easy to replicate with std::vector and an extra index.

                I have actually built it successfully with the boost version of the buffer. It's a header-only include for that particular object. There are also some other uses of boost like bind() and prior(), but those have std:: library counterparts which could replace them. I simply felt that I could count on the reliability of boost more so than my own implementation, in this case. Either way, I do still have my implementation, if it does seem that boost is causing problems (you can see it here, if you wish: https://github.com/wa3573/Drexel_MRP_Key_Scanner/blob/157246dea25403c95c2b11f91594e68a77b3a80c/serial-piano-scanner/Utility/circular_buffer.h).

                giuliomoro I'd suggest to pre-allocate all the objects and only start calling Trigger() when the key position exceeds a given threshold, until the key press is completed.

                Indeed, good call.

                giuliomoro How many threads do you have currently?

                Currently there is the main program controller thread, an IO thread (which handles USB communication), a RGB LED control thread, a Mapping Scheduler thread and a future events Scheduler. The LED control thread will be disabled until we can verify the functionality of the rest of the program, and may not ever be enabled at all, depending on performance. The scheduling threads may turn out to be unnecessary, and it would be ideal to remove them.

                giuliomoro The use of plain or __wrap() threads really depends on what each thread is doing.

                Great, that clears things up well for me.

                giuliomoro To sum up, I would encourage you to implement all you need using the regular pthread, then compile and link your application adding the appropriate LDFLAGS. Try to run it this way and see if it all works fine. If you have one thread that switches mode very often (you can monitor the MSW column in the threads stats in /proc/xenomai/sched/stat while the program is running), so much that it becomes a performance issue, then you may want to consider moving that (those) thread(s) to a regular Linux thread (calling __real_pthread...), and use XDDP to communicate with it from a RT thread (if needed).

                That makes sense to me, I will definitely go this route and let you know how it turns out.

                giuliomoro Another note: I'd discourage using automatic symbol wrapping in combination with some standard C++ library classes (e.g.: std::thread, std::mutex). See here https://www.xenomai.org/pipermail/xenomai/2017-March/037223.html

                Yeah, I have stuck with only the pthread library, as I was under the impression the standard C++ library didn't necessarily play well with Xenomai, after reading this thread: https://forum.bela.io/d/530-audio-thread-mode-switch-std-thread/21 and this message: https://www.xenomai.org/pipermail/xenomai/2017-March/037220.html

                Thanks again for your help, this has all been extremely useful.

                  wa3573 Yeah, I have stuck with only the pthread class, as I was under the impression the standard C++ library didn't necessarily play well with Xenomai,

                  Don't get me wrong: you can use std::thread, std::condition_variable_any and std::mutex with no problem at all within a Xenomai application, as long as you do not enable automatic symbol wrapping. If you do, then most likely these will stop working, depending on what happened with symbol wrapping.

                  I did a limited re-implementation of std::mutex and std::condition_variable_any here that automatically turns the current thread to a Xenomai one if needed.

                  a month later

                  Hey Giulio, hope you're doing well. I am in the process of integrating Touchkeys with the Bela core, and am running into a strange issue when I compile a test project (external, not in the IDE) containing a single source file (main.cpp), which is simply a combination of the default_libpd_render.cpp and default_main.cpp files. I attempted to attach it to this reply, but the upload is not completing. The source compiles fine, but when running, it always breaks on the first call to new Midi(); in openMidiDevice(). Here's some debug info:

                  [New Thread 0xb6987450 (LWP 724)]
                  Running Pd 0.48-2
                  Audio channels in use: 2
                  Analog channels in use: 8
                  Digital channels in use: 16

                  Thread 1 "bela_custom_mai" hit Breakpoint 1, openMidiDevice (name="hw:1,0,0", verboseSuccess=false,
                  verboseError=false) at ../Main.cpp:86
                  86 Midi* newMidi = new Midi();
                  (gdb) step
                  bela_custom_main: malloc.c:2406: sysmalloc: Assertion '(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.

                  When I swap the new out for malloc(), I get memory corruption at the same spot, every time. This is confusing to me, as I am not sure what could be causing the memory corruption with the default renderer and default main files.

                  Could this have anything to do with optimizations used (or not) when compiling? I believe I still had -O0 but will double-check -O3 yields the same results. PD projects compiled in the IDE do not crash. Any advice or ideas would be appreciated, thanks!

                  Hmm not sure, I would guess there is something wrong before that line. You can paste the full file in your post, but 600 lines in here can be hard to parse. So I'd recommend you put it on github and put a link here so we can review it there.

                  https://github.com/wa3573/misc/blob/master/Main.cpp

                  Indeed, I figure the problem is before the first call to openMidiDevice(). The only explicit memory allocation I see before that point is a malloc() at line 346: char* str = (char*)malloc(sizeof(char) * strSize); but this is free'd just afterwards.

                  I did not change anything from the default files, which is why I am stumped, and am thinking that perhaps there is a problem being introduced during linking

                  that builds and runs fine for me. Do you know what version of the core code you have? Also the image number would be of interest (grep v0 /etc/motd).

                  I checked that earlier memory allocation again and it looks good, can you try commenting out lines 343 to 352 and see if that changes anything?

                  Also, can you try compiling simply default_libpd_render.cpp in your project folder without the lines from main.cpp?

                  Interesting. What flags did you use for the compiler/linker? Just so we're on the same page. From what I understand, g++ on Bela is just an alias for arm-linux-gnueabihf-g++, correct? As in, it will link against those libraries without having to call that linker/compiler explicitly. I recently updated the core, around when we started this project, so it is recent. However, I will post that info in a little while.

                  I will try all of that and get back to you with the info shortly. Although, if I do not include the lines from main.cpp the linker complains of an undefined reference to main(), since I am compiling this outside of the IDE, I assume.

                  Hmm ok, I had just built it as a Bela program. Why don't you send me the full command line you use to build the file?

                  g++ should be an alias for arm-linux-gnueabihf-g++ , yes. We use clang++ for all of the Bela stuff, but g++ should work equally fine. To see all the flags used by Bela for compilation and linking, just add AT= to your command line when building a project. For instance, after modifying some files in the project myProject, run this (equivalent to adding AT= in the "Make parameters" field in the IDE and hitting the "build" button in the IDE)

                  make -C ~/Bela PROJECT=myProject AT=

                  Thanks, that AT= addition was helpful just to see how it was building internally. Here's the output for that when pointed towards a test PD project without the custom render and main files:

                  /usr/bin/clang++ -Llib/ -pthread -o "/root/Bela/projects/test/test" build/core/FormatConvert.o build/core/OscillatorBank_routines.o build/core/math_runfast.o build/core/Gpio.o build/core/I2c_Codec.o build/core/PulseIn.o build/core/scope_ws.o build/core/RTAudio.o build/core/UdpClient.o build/core/WriteFile.o build/core/RTAudioCommandLine.o build/core/OSCClient.o build/core/WriteFile_c.o build/core/AuxTaskRT.o build/core/board_detect.o build/core/AuxTaskNonRT.o build/core/Midi.o build/core/AuxiliaryTasks.o build/core/I2c_TouchKey.o build/core/Midi_c.o build/core/Scope.o build/core/PruBinary.o build/core/PRU.o build/core/UdpServer.o build/core/OSCServer.o build/core/GPIOcontrol.o build/core/Spi_Codec.o build/core/JSONValue.o build/core/DigitalChannelManager.o build/core/JSON.o ./build/core/default_main.o ./build/core/default_libpd_render.o -Wl,--no-as-needed -L/usr/xenomai/lib -lcobalt -lmodechk -lpthread -lrt -lprussdrv -lstdc++ -Wl,--no-as-needed -L/usr/xenomai/lib -lcobalt -lmodechk -lpthread -lrt -lasound -lseasocks -lNE10 -lmathneon -lsndfile -lpd -lpthread

                  So, I've tried using that combination, building as follows:

                  Invoking: GCC C++ Compiler
                  g++ -std=c++14 -I/usr/local/include/libpd/ -I/home/juniper/Downloads/liblo-0.29 -I/home/juniper/Downloads/boost_1_69_0 -I/root/Bela/include -pthread -O3 -g3 -Wall -c -fmessage-length=0 -MMD -MP -MF"Main.d" -MT"Main.o" -o "Main.o" "../Main.cpp"

                  Invoking: GCC C++ Linker
                  g++ -o "bela_custom_main" ./Main.o -L/root/Bela/lib/ -pthread -lbelaextra -lbela -llo -Wl,--no-as-needed -L/usr/xenomai/lib -lcobalt -lmodechk -lpthread -lrt -lprussdrv -lstdc++ -Wl,--no-as-needed -L/usr/xenomai/lib -lcobalt -lmodechk -lpthread -lrt -lasound -lseasocks -lNE10 -lmathneon -lsndfile -lpd -lpthread

                  And this compiles and links successfully, but I get the same error. Same thing if I switch out g++ for clang++

                  grep v0 /etc/motd
                  Bela image, v0.3.6b, 23 October 2018

                  Not sure where to look for the core code version

                    Right, the problem is that in the compilation step you don't have the command-line -D options that Bela uses.

                    If I run

                    make -C ~/Bela PROJECT=test-we run AT=

                    I get

                    clang++ -I/root/Bela/projects/test-we -I./include -I./build/pru/ -I/usr/xenomai/include/cobalt -I/usr/xenomai/include -march=armv7-a -mfpu=vfp3 -D_GNU_SOURCE -D_REENTRANT -fasynchronous-unwind-tables -D__COBALT__ -D__COBALT_WRAP__ -DXENOMAI_SKIN_posix -DXENOMAI_MAJOR=3 -O3 -march=armv7-a -mtune=cortex-a8 -mfloat-abi=hard -mfpu=neon -ftree-vectorize -ffast-math -DNDEBUG -DBELA_USE_RTDM -I/root/Bela/resources/stretch/include -std=c++11 -DNDEBUG -Wall -c -fmessage-length=0 -U_FORTIFY_SOURCE -MMD -MP -MF"/root/Bela/projects/test-we/build/Main.d" -o "/root/Bela/projects/test-we/build/Main.o" "/root/Bela/projects/test-we/Main.cpp"

                    Some of those defines are needed for the header files to work properly. include/Midi.h, for instance, requires XENOMAI_SKIN_native or XENOMAI_SKIN_posix to be defined. This will change at some point when we drop support for the native skin, but for now it's important to have them. Also, I think you should make sure you use the same include paths (including the xenomai ones).
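When building outside the Bela Makefile, one quick way to see which of these defines actually reached a given source file is to report them from the code itself. The sketch below is only a debugging aid under that assumption; `buildConfigSummary` is a hypothetical helper, not part of the Bela API, and it checks only the macros named in the command line above.

```cpp
#include <cassert>
#include <string>

// Reports which Xenomai-related -D options were visible when this
// translation unit was compiled.
std::string buildConfigSummary() {
    std::string s = "skin=";
#if defined(XENOMAI_SKIN_posix)
    s += "posix";
#elif defined(XENOMAI_SKIN_native)
    s += "native";
#else
    s += "MISSING"; // neither XENOMAI_SKIN_posix nor XENOMAI_SKIN_native defined
#endif
    s += " cobalt=";
#if defined(__COBALT__)
    s += "yes";
#else
    s += "no";
#endif
    assert(s.find("skin=") == 0);
    return s;
}
```

Printing the result at the start of main() in both the working and the failing build makes a mismatch in compile flags visible at a glance.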