audio thread mode switch - std::thread

thetechnobear · Apr 25, 2018

so I've got the basics working, and it seems to be OK for threads/cond, but if I try to wrap pthread_mutex_lock,
I get

terminate called after throwing an instance of 'std::system_error'
  what():  Operation not permitted
Aborted

searching the internet suggests this is usually caused by not linking pthread, but I can see it is linked, and using nm I can also see that pthread_mutex_lock is wrapped...

I saw a note about issues with static mutexes, but this one is wrapped in a class that is dynamically created, so I'm not sure that's the case here,

any thoughts?

giuliomoro · Apr 25, 2018

From what context are you invoking __wrap_pthread_mutex_lock()?

The error you are getting is a c++ exception, so I would be surprised if it is thrown as a consequence of a C call, although I am not familiar with the inner workings of libstdc++.
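For what it's worth, as far as I can tell libstdc++'s std::mutex::lock() reports a non-zero return from the underlying lock call by throwing std::system_error, so a C-level EPERM can surface as exactly this kind of exception. A minimal sketch of that conversion, assuming nothing about Xenomai (throw_if_pthread_error and error_code_after are hypothetical helpers, not part of libstdc++ or Cobalt):

```cpp
#include <cerrno>
#include <system_error>

// Hypothetical helper: turn a pthread-style return code into a C++ exception,
// similar in spirit to what std::mutex::lock() does in libstdc++.
void throw_if_pthread_error(int rc) {
    if (rc != 0) {
        throw std::system_error(rc, std::generic_category());
    }
}

// Returns the error code carried by the exception, or 0 if nothing was thrown.
int error_code_after(int rc) {
    try {
        throw_if_pthread_error(rc);
    } catch (const std::system_error& e) {
        return e.code().value();
    }
    return 0;
}
```

With this, error_code_after(EPERM) comes back as EPERM, which is the "Operation not permitted" case from the first post.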

thetechnobear · Apr 26, 2018

so it appears I always get this behaviour, even in a simple example

with the following defined:

--wrap pthread_mutex_init
--wrap pthread_mutex_lock
--wrap pthread_mutex_unlock

using the following code:

    pthread_mutex_t m;
    LOG_0("cobalt ");
    LOG_0("c init " << pthread_mutex_init(&m,0));
    LOG_0("c lock1 " << pthread_mutex_lock(&m));

i get the following output:

cobalt 
c init 1
c lock1 1

note: all calls should return 0 for success; 1 = EPERM
really only the first, pthread_mutex_init, is the issue, as it's obviously not initialising the mutex successfully.

(also, a second lock does not block, indicating that it's not just the error codes: the lock really is not taken)

if I remove the pthread_mutex_* wraps, it all works as expected...

I'm using the following compiler options

            message(STATUS "BELA optimized")
            # /usr/xenomai/bin/xeno-config --skin=cobalt --no-mode-check --cflags
            set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -I/usr/xenomai/include/cobalt -I/usr/xenomai/include -march=armv7-a -mfpu=vfp3 -D_GNU_SOURCE -D_REENTRANT -D__COBALT__")
            set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wl,@${PROJECT_SOURCE_DIR}/xenomai.wrappers")
            set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=armv7-a -mtune=cortex-a8 -mfloat-abi=hard -mfpu=neon -ftree-vectorize --fast-math")
            #  /usr/xenomai/bin/xeno-config --skin=cobalt --no-auto-init --no-mode-check --ldflags
            set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--no-as-needed -L/usr/xenomai/lib -lcobalt -lmodechk -lpthread -lrt ")
            set(CMAKE_MODULE_LINKER_FLAGS "${CMAKE_MODULE_LINKER_FLAGS} -Wl,--no-as-needed -L/usr/xenomai/lib -lcobalt -lmodechk -lpthread -lrt ")
            set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,--no-as-needed -L/usr/xenomai/lib -lcobalt -lmodechk -lpthread -lrt ")
            #set(CMAKE_STATIC_LINKER_FLAGS "${CMAKE_STATIC_LINKER_FLAGS} -Wl,--no-as-needed -L/usr/xenomai/lib -lcobalt -lmodechk -lpthread -lrt ")

they are defined in CMake, and I can see at compile time that they are applied correctly.
also, if I use nm on the built binary, I can clearly see the __wrap_pthread_* functions are bound, rather than the underlying pthread functions.

not sure what to try next really ...

EDIT:
so if I remove --wrap pthread_mutex_init and leave the wrapped lock/unlock, this fails on the lock
(I guess this is expected, as the mutex is now a pthread mutex, and the wrapped lock is for a Cobalt mutex?!)
i.e.

cobalt 
c init 0
c lock1 1

giuliomoro · Apr 26, 2018

You need to call xenomai_init() before you can use Xenomai services. This is normally not needed if running inside a Bela project because it is called in the background.

This program:

#include <iostream>
#include <pthread.h>
#include <xenomai/init.h>

int main ()
{
	int argc = 0;
	char *const *argv;
	xenomai_init(&argc, &argv);
	pthread_mutex_t m;
	std::cout << "cobalt " << " \n";
	std::cout << "c init " << pthread_mutex_init(&m,0) << " \n";
	std::cout << "c lock1 " << pthread_mutex_lock(&m) << " \n";
	std::cout << "c try lock "<< pthread_mutex_trylock(&m) << " \n";
	std::cout << "c lock2 "<< pthread_mutex_lock(&m) << " \n";
	std::cout << "c lock3 "<< pthread_mutex_lock(&m) << " \n";
	std::cout << "c unlock "<< pthread_mutex_unlock(&m) << " \n";
	std::cout << "c lock4 "<< pthread_mutex_lock(&m) << " \n";
	std::cout << "//cobalt" << " \n";
	return 0;
}

compiled like this:

g++ -c -pthread test.cpp `/usr/xenomai/bin/xeno-config --skin=cobalt --cflags` -o test.o  &&\
 g++ test.o -o test `/usr/xenomai/bin/xeno-config --skin=cobalt --ldflags --no-auto-init`  -Wl,--wrap=pthread_mutex_init -Wl,--wrap=pthread_mutex_lock -Wl,--wrap=pthread_mutex_trylock

gives:

root@bela:~# ./test
cobalt
c init 0
c lock1 0
c try lock 16

and then it hangs, correctly.
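The 16 printed by the trylock line is EBUSY on Linux: the mutex is already held, so pthread_mutex_trylock() reports that instead of blocking, whereas the following plain lock on a default mutex waits forever. A minimal sketch using only plain (unwrapped) POSIX calls; trylock_on_held_mutex is a made-up name for illustration:

```cpp
#include <cerrno>
#include <pthread.h>

// Lock a default mutex, then try to lock it again from the same thread.
// pthread functions return the error number directly (they do not set errno).
int trylock_on_held_mutex() {
    pthread_mutex_t m;
    if (pthread_mutex_init(&m, nullptr) != 0) return -1;
    if (pthread_mutex_lock(&m) != 0) {
        pthread_mutex_destroy(&m);
        return -1;
    }
    int rc = pthread_mutex_trylock(&m);  // mutex is held, so this should not block
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

On a stock Linux system this should return EBUSY, matching the "c try lock 16" line above.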

thetechnobear · Apr 26, 2018

thanks for that... that solved that issue...

now getting errors with semaphores (arghh, this is tiring !)

#include <cerrno>
#include <ctime>
#include <iostream>
#include <pthread.h>
#include <semaphore.h>
#include <xenomai/init.h>

#include <thread>
#include <chrono>


sem_t sem;

void* func(void*) {
    while(true) {
        int t = 1000 * 1000;                 // timeout: 1 second, in microseconds
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);  // sem_timedwait() takes an absolute deadline
        t += (ts.tv_nsec/1000);              // add the current sub-second part, in microseconds
        ts.tv_nsec = 0;
        ts.tv_sec += t/1000000;              // carry whole seconds
        ts.tv_nsec += 1000*(t%1000000);      // remainder back to nanoseconds
        errno=0;
        int rc=sem_timedwait(&sem,&ts);
        std::cout << "timedwait " << rc << " : " << errno << std::endl;
    }
    return 0;
}

int main ()
{
    int argc = 0;
    char *const *argv;
    xenomai_init(&argc, &argv);
    std::cout << "si " << sem_init(&sem,0,0) << std::endl;

    pthread_t thr;
    std::cout << "thr " << pthread_create(&thr,0,func,0) << std::endl;

    for(int i=0;i<10;i++) {
        sem_post(&sem);
        std::this_thread::sleep_for(std::chrono::seconds(2));
    }
    std::this_thread::sleep_for(std::chrono::seconds(10));

    return 0;
}

so what should happen is that the semaphore times out every other time, as it's waiting 1 second but the producer is posting every 2 seconds

so without the wrapper I correctly get

si 0
thr 0
timedwait 0 : 0
timedwait -1 : 110
timedwait 0 : 0
timedwait -1 : 110
timedwait 0 : 0
timedwait -1 : 110
timedwait 0 : 0

but if I wrap with

--wrap pthread_create
--wrap sem_init
--wrap sem_post
--wrap sem_timedwait

I get nothing useful

si 0
thr 0
timedwait 0 : 0
timedwait 0 : 11
timedwait 0 : 11
timedwait 0 : 11

the return code is wrong: not -1, and error 11 (EAGAIN) is not even defined for sem_timedwait.
as far as I can see, sem_timedwait seems to be behaving like sem_wait, i.e. it just waits for a post and does not time out.

... I'm wondering if it's something to do with clocks, but I cannot find any examples of timed waits with Xenomai

note: I'm compiling the same way as in your example, except I use -Wl,@file ...
(using nm, I then check things are wrapped as expected)

I'm starting to get the feeling I should ditch the POSIX wrappers, they seem really unreliable...
(they kind of worked in some instances with std C++ classes, and in other cases not)
but that means a whole load of work to start abstracting stuff, just to get Bela working

(and i thought c++ stl was the abstraction )
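The timeout arithmetic in func() above can be factored into a small helper, which makes it easier to see that sem_timedwait() wants an absolute deadline read from a clock, not a relative delay. A sketch only, assuming a non-negative timeout; add_microseconds and deadline_after are hypothetical names, not Xenomai or POSIX functions:

```cpp
#include <time.h>

// Hypothetical helper: add a relative timeout in microseconds (>= 0) to a
// timespec, keeping tv_nsec normalised to [0, 1e9).
timespec add_microseconds(timespec ts, long usec) {
    const long kNsecPerSec = 1000000000L;
    ts.tv_sec  += usec / 1000000L;
    ts.tv_nsec += (usec % 1000000L) * 1000L;
    if (ts.tv_nsec >= kNsecPerSec) {  // carry into seconds
        ts.tv_nsec -= kNsecPerSec;
        ts.tv_sec  += 1;
    }
    return ts;
}

// Absolute deadline "usec from now" on the given clock.
timespec deadline_after(clockid_t clock, long usec) {
    timespec now{};
    clock_gettime(clock, &now);
    return add_microseconds(now, usec);
}
```

Usage would be along the lines of `timespec ts = deadline_after(CLOCK_REALTIME, 1000000); sem_timedwait(&sem, &ts);`, with the clock read through the same API family (wrapped or unwrapped) as the wait itself.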

giuliomoro · Apr 26, 2018

If it returns 0, then errno is presumably just the last value set by some other call. That would explain the unexpected EAGAIN.

This said, I am not sure what the problem is. I can see by cat /proc/xenomai/registry/usage that a Xenomai registry entry is used when initing the semaphore.

There actually is an instance where sem_timedwait() is used in the Xenomai code base:

./testsuite/smokey/cpu-affinity/cpu-affinity.c:	if (!__Terrno(ret, __STD(sem_timedwait(&context->done, &ts)))) {

have a look in /opt/xenomai-3/testsuite/smokey/cpu-affinity/cpu-affinity.c and see if that helps at all.
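On the errno point: sem_timedwait() returns 0 on success and -1 with errno set on failure, so errno is only worth reading in the -1 case. On Linux, 110 is ETIMEDOUT, which matches the unwrapped output earlier. A small sketch of how the (rc, errno) pair printed by the test loop could be interpreted; describe_timedwait is a hypothetical helper:

```cpp
#include <cerrno>
#include <string>

// Hypothetical helper: interpret the (rc, errno) pair printed by the test loop.
// errno is only meaningful when the call reported failure (rc == -1).
std::string describe_timedwait(int rc, int saved_errno) {
    if (rc == 0) return "acquired";  // saved_errno is stale here, ignore it
    if (saved_errno == ETIMEDOUT) return "timed out";
    if (saved_errno == EINTR) return "interrupted";
    return "error " + std::to_string(saved_errno);
}
```

Under this reading, a line such as "timedwait 0 : 11" is simply a successful wait with a leftover errno value.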

giuliomoro · Apr 27, 2018

@thetechnobear got it: it was easier than expected. You need to wrap clock_gettime().
If you try this with the same options as above (that is, without wrapping clock_gettime() at link time)

#include <cerrno>
#include <ctime>
#include <iostream>
#include <pthread.h>
#include <semaphore.h>
#include <xenomai/init.h>

#include <thread>
#include <chrono>


sem_t sem;

void* func(void*) {
    while(true) {
        int t = 1000 * 1000;                 // timeout: 1 second, in microseconds
        struct timespec ts;
        double time;

        // Linux clock: printed for comparison only
        clock_gettime(CLOCK_REALTIME, &ts);
        time = ts.tv_sec + ts.tv_nsec/1000000000.0;
        std::cout << "clock_gettime: "<< time << "\n";

        // Cobalt clock: this reading is the one used for the deadline below
        __wrap_clock_gettime(CLOCK_REALTIME, &ts);
        time = ts.tv_sec + ts.tv_nsec/1000000000.0;
        std::cout << "__wrap_clock_gettime: "<< time << "\n";

        t += (ts.tv_nsec/1000);              // add the current sub-second part, in microseconds
        ts.tv_nsec = 0;
        ts.tv_sec += t/1000000;              // carry whole seconds
        ts.tv_nsec += 1000*(t%1000000);      // remainder back to nanoseconds
        errno=0;
        int rc=sem_timedwait(&sem,&ts);
        std::cout << "timedwait " << rc << " : " << errno << std::endl;
    }
    return 0;
}

int main ()
{
    int argc = 0;
    char *const *argv;
    xenomai_init(&argc, &argv);
    std::cout << "si " << sem_init(&sem,0,0) << std::endl;

    pthread_t thr;
    std::cout << "thr " << pthread_create(&thr,0,func,0) << std::endl;

    for(int i=0;i<10;i++) {
        sem_post(&sem);
        std::this_thread::sleep_for(std::chrono::seconds(2));
    }
    std::this_thread::sleep_for(std::chrono::seconds(10));

    return 0;
}

(note that I am calling the POSIX clock_gettime() and the Cobalt __wrap_clock_gettime() and comparing the results). You will get:

si 0
thr 0
clock_gettime: 1.52479e+09
__wrap_clock_gettime: 707.068
timedwait 0 : 0
clock_gettime: 1.52479e+09
__wrap_clock_gettime: 707.069
timedwait -1 : 110
clock_gettime: 1.52479e+09
__wrap_clock_gettime: 708.07
timedwait 0 : 11
clock_gettime: 1.52479e+09
__wrap_clock_gettime: 709.068
timedwait -1 : 110
clock_gettime: 1.52479e+09
__wrap_clock_gettime: 710.068
timedwait 0 : 11
clock_gettime: 1.52479e+09
__wrap_clock_gettime: 711.068

so there are two CLOCK_REALTIME, one for Xenomai and one for Linux. Earlier on, you were reading from the Linux one and using it to set an event for the Xenomai one.

So you can keep your code as above but add --wrap clock_gettime to your @ file.
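To put rough numbers on the mismatch, using the values printed above: a deadline of "Linux now + 1 s" lies roughly 1.5e9 seconds (decades) ahead of the Cobalt clock, which would explain why the wrapped sem_timedwait() appeared never to time out. A small worked calculation; the helper and constant names are made up for illustration:

```cpp
// Seconds until a deadline, as seen by a clock that currently reads `now`.
double seconds_until(double deadline, double now) {
    return deadline - now;
}

// Values taken from the output above (rounded as printed).
const double kLinuxNow   = 1.52479e9;  // clock_gettime(CLOCK_REALTIME)
const double kCobaltNow  = 707.068;    // __wrap_clock_gettime(CLOCK_REALTIME)
const double kTimeoutSec = 1.0;

// Deadline built from the Linux clock but waited on against the Cobalt clock.
double mixed_wait() {
    return seconds_until(kLinuxNow + kTimeoutSec, kCobaltNow);
}

// Deadline and wait both on the Cobalt clock.
double matched_wait() {
    return seconds_until(kCobaltNow + kTimeoutSec, kCobaltNow);
}
```

mixed_wait() comes out at about 1.5e9 seconds, while matched_wait() is the intended 1 second.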

thetechnobear · Apr 27, 2018

Thanks @giuliomoro, that's fixed it...

after a few other small hurdles, I finally have success on my project: no MSW (OK, one at startup, but that's OK)
so very happy!

most of my fights to get this working have been to do with the way std C++ and the POSIX layer interact, or don't.
in many places you can step around the issues (mostly using native_handle()), but in some you can't.
this is not because the lib has pthreads compiled in (it doesn't, it just has references, which correctly get wrapped), but I think it's because it has some internal state, which just seems to get in the way for some objects.
(e.g. std::thread with std::mutex can be made to work, but std::thread with semaphores, nope!)

add to that, creating test programs often creates other issues... and it's a bit of a nightmare,
though I'm sure once you have done it once, it's not so bad.

anyway thanks Giulio , I really appreciated your support, and patience
... you'll be able to see it at superbooth.

giuliomoro · Apr 30, 2018

This got me confused a while ago, and to this day, I am still not quite sure I get the answer: http://www.xenomai.org/pipermail/xenomai/2017-March/037220.html

thetechnobear you'll be able to see it at superbooth.

am I getting a cake?

thetechnobear · Apr 30, 2018

giuliomoro This got me confused a while ago, and to this day, I am still not quite sure I get the answer: http://www.xenomai.org/pipermail/xenomai/2017-March/037220.html

yeah, that's in line with what I've found... I had a bit of a look at the C++ STL code, and I think the issue is that whilst it does use the pthreads library (and these calls get wrapped), there's some additional state and logic, which sometimes gets in the way
e.g. condition variables don't work at all, mutexes and threads kind of work, sometimes.
it was the 'sometimes' that really hit me, especially as I could see the calls were being wrapped.

as I said, in some places you can work around it using the native_handle() of the STL; this was useful for me for threads, where I didn't want to change the C++ class interface, which would get really messy... but it still ended up in quite a few #ifdefs.
I guess once I've got everything working as I want it on Bela, I'll take the pain and refactor, so I can remove the conditional macros and replace them with a 'platform layer' for Bela. (will be useful for win32 as well)
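A 'platform layer' of the kind described here could look roughly like the sketch below: callers see a single class, and each platform supplies the implementation behind it. Everything in it is an assumption for illustration (PlatformSemaphore is a made-up name), and only a portable, non-real-time backend is shown; a Bela build would presumably swap in a backend built on the Cobalt sem_* calls instead.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical platform-layer interface: one class for callers, with a
// per-platform implementation behind it.
class PlatformSemaphore {
public:
    void post() {
        std::lock_guard<std::mutex> lock(m_);
        ++count_;
        cv_.notify_one();
    }

    // Returns true if a token was taken, false if the timeout expired first.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_);
        if (!cv_.wait_for(lock, timeout, [this] { return count_ > 0; })) {
            return false;
        }
        --count_;
        return true;
    }

private:
    // Portable backend, intended for non-real-time platforms only.
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;
};
```

The point of the design is that the #ifdefs move inside one class, so the rest of the code never needs to know which primitives the platform provides.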

giuliomoro · Apr 30, 2018

what happens if you avoid wrappers at linking time and instead you explicitly use something like myrt_pthread_create(), defined as:

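// __wrap_pthread_create() is assumed to be declared by the Xenomai (Cobalt) headers.
int myrt_pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg)
{
#ifdef XENOMAI
	// Xenomai build: call the Cobalt implementation explicitly
	return __wrap_pthread_create(thread, attr, start_routine, arg);
#else
	// everywhere else: plain POSIX
	return pthread_create(thread, attr, start_routine, arg);
#endif
}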

and so on for each function you use.
This would force you to use myrt_ functions (and not C++ stuff) for your RT stuff, but at least you can keep all the #ifdefs in one place and you avoid using linking-time wrappers which may confuse the stl.

thetechnobear · Apr 30, 2018

giuliomoro yeah, abstraction is the answer, but Windows is not POSIX compliant, so returning an int as a common denominator won't really work.

for now what I've done is:

#ifdef __COBALT__
    pthread_t ph = writer_thread_.native_handle();
    pthread_create(&ph, 0,write_thread_func,this);
#else
    writer_thread_ = std::thread(write_thread_func, this);
#endif

so the header file/interface is still a std::thread, and we just 'overwrite' the thread handle of the std::thread in the Xenomai case.
interestingly it just appears to be creation that's the issue; join/thread destroy etc. all work OK. (as far as I've seen so far)

it's not ideal. what I will probably do in the mid term is abstract everything up a couple of levels, so there are platform-specific implementations of things like producer/consumer queues... this means I can also deal with things like whether the platform has semaphores or not.

really all this is not an issue once you have experience with it, it just catches you out when you're not familiar with the quirks of the Xenomai POSIX skin (when used with C++)
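One caveat that may be worth checking with the native_handle() approach: in libstdc++, native_handle() returns the handle by value, so pthread_create(&ph, ...) fills in the local copy and the std::thread object itself stays empty. A later join() on the std::thread may therefore not be acting on the thread that was created. A minimal sketch with plain pthreads (no Xenomai involved; the function names are made up):

```cpp
#include <pthread.h>
#include <thread>

static void* noop(void*) { return nullptr; }

// Mirrors the pattern above: copy the handle out of an empty std::thread and
// pass the copy to pthread_create(). Returns whether the std::thread became joinable.
bool std_thread_adopts_pthread() {
    std::thread t;                     // default-constructed: no thread attached
    pthread_t ph = t.native_handle();  // a copy of the handle, not a reference
    if (pthread_create(&ph, nullptr, noop, nullptr) != 0) return false;
    pthread_join(ph, nullptr);         // the new thread is reachable only through ph
    return t.joinable();               // t itself was never updated
}
```

If this returns false on the target toolchain too, keeping the pthread_t alongside the std::thread and joining it explicitly would be the safer option.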

alexvoina · Apr 28, 2020

My JuceBela program crashes with a segmentation fault in __wrap_clock_gettime () from /usr/xenomai/lib/libcobalt.so.2

I see a lot of relevant discussion here, but before digging too deep, do you have any quick suggestions for something I might have missed?

I have started from your demo project @giuliomoro, so all the preprocessor defs like __COBALT__, __COBALT_WRAP__ and other fields should be in place.

giuliomoro · Apr 28, 2020

Has Xenomai been initialised by the time __wrap_clock_gettime() is called? Xenomai is initialised with a call to xenomai_init(), which happens in Bela_initAudio() https://github.com/BelaPlatform/Bela/blob/master/core/RTAudio.cpp#L244 . You can also initialise it yourself at an earlier stage if needed (e.g. as soon as you get into main()), and that's safe (you will only get a warning that it has been initialised twice).

alexvoina · Apr 28, 2020

i'll try that. I just assumed JUCE is doing it somewhere

alexvoina · Apr 28, 2020

that fixed it! Next time I won't assume anything. Cheers!

giuliomoro · Apr 28, 2020

You can remove --no-auto-init from the linker flags if you want it to be automatically inited at startup, I think.

alexvoina · May 7, 2020

hi, I'm trying to use valgrind to measure the memory footprint of my app, and hopefully catch some memory leaks if any exist, but it stops early. Note that I'm using the v0.4.0alpha OS image.

root@bela:~# valgrind --tool=massif ./my_app
==13172== Massif, a heap profiler
==13172== Copyright (C) 2003-2017, and GNU GPL'd, by Nicholas Nethercote
==13172== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==13172== Command: ./my_app
==13172==
--13172-- WARNING: unhandled arm-linux syscall: 983106
--13172-- You may be able to write your own handler.
--13172-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--13172-- Nevertheless we consider this a bug. Please report
--13172-- it at http://valgrind.org/support/bug_reports.html.
0"000.000| BUG in low_init(): [main] Cobalt core not enabled in kernel
==13172==

I'm suspecting it is because I'm calling xenomai_init() explicitly, and then it is called again at some point by JUCE code because I can see 2 calls to xenomai_init() on the dump file generated by valgrind, the second one being at the very end of the file.

I have tried to add a linker flag --no-auto-init but it is not recognized by the compiler. Any ideas?

giuliomoro · May 7, 2020

alexvoina I have tried to add a linker flag --no-auto-init but it is not recognized by the compiler. Any ideas?

that flag needs to be passed to /usr/xenomai/bin/xeno-config when using it to obtain the linker flags. The full call to run on the board would be

/usr/xenomai/bin/xeno-config --skin=posix --ldflags --no-auto-init

which returns

-Wl,--no-as-needed -Wl,@/usr/xenomai/lib/cobalt.wrappers -Wl,@/usr/xenomai/lib/modechk.wrappers    -L/usr/xenomai/lib -lcobalt -lmodechk -lpthread -lrt

I normally then remove the two wrappers files (as I like to manually call the Xenomai versions of POSIX functions if and when needed); the final result is:

-Wl,--no-as-needed -L/usr/xenomai/lib -lcobalt -lmodechk -lpthread -lrt

see here.

Several years ago, when I was looking into running Valgrind on Bela, I stopped looking because I read somewhere that it didn't play nicely with Xenomai, possibly because of symbol wrapping. In the end I never tried it, but the above line disables all wrappings, so maybe it will work?

alexvoina Cobalt core not enabled in kernel

this error here would seem to suggest that you are not running it on Bela. Are you actually running this on Bela?????

alexvoina I'm suspecting it is because I'm calling xenomai_init() explicitly, and then it is called again at some point by JUCE code because I can see 2 calls to xenomai_init() on the dump file generated by valgrind, the second one being at the very end of the file.

if this annoys you, you could try calling Bela_initAudio() early on, and then when it gets called again, the gXenomaiInited variable will prevent it from initialising again. I don't think you can access that variable from the Juce program though. Alternatively you can edit your core/RTAudio.cpp code so that it does not initialise xenomai, then rebuild the Bela lib (make -C /root/Bela lib) and then it's up to your JUCE app to call it early on.
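The guard described here (a flag that stops a second initialisation) can be sketched without Xenomai at all. In the sketch below, platform_init_stub is a stand-in for the real initialisation call and only counts invocations; all names are hypothetical, and the plain bool is not thread-safe, so it assumes initialisation happens from a single thread early on.

```cpp
// Stand-in for the real initialisation call (xenomai_init() on Bela); it only
// counts invocations so the guard can be exercised anywhere.
static int g_init_calls = 0;
static void platform_init_stub() { ++g_init_calls; }

// Flag playing the role that gXenomaiInited plays in the Bela core code.
static bool g_platform_inited = false;

// Hypothetical guard: can be called from main() and again from library code;
// the initialisation itself runs only the first time.
void ensure_platform_inited() {
    if (g_platform_inited) return;
    platform_init_stub();
    g_platform_inited = true;
}
```

Calling ensure_platform_inited() from both the application and the library then results in a single initialisation and no duplicate-call warning.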

alexvoina · May 8, 2020

giuliomoro this error here would seem to suggest that you are not running it on Bela. Are you actually running this on Bela?????

Yes of course. And I can see some calls to libcobalt.so in valgrind's dump file, which I suppose executed successfully, so I don't understand what exactly it means when it says that 'cobalt core is not enabled in kernel'. Also, if you check my comment above, you can see that it was crashing in libcobalt.so.2 before I forced the xenomai_init() call in my main function (since I solved that, I assume it works). My program executes quite nicely; this is just for the purpose of diagnosis and optimization.

On this topic, I still have the JUCE_ALSA flag enabled as well as JUCE_BELA for the juce_audio_devices module. Is there a chance that my application is running on the Bela as it would on a standard Linux machine, and not actually making use of Bela's low-latency capabilities, which I suppose are powered by Xenomai?

giuliomoro -Wl,--no-as-needed

added this but still getting the warning message, and Valgrind behaves the same
3"059.413| WARNING: [main] duplicate call from main program to xenomai_init() ignored
3"059.727| WARNING: [main] (xeno-config --no-auto-init disables implicit call)

so I guess I have to try rebuilding Bela