Strictly speaking, if you are measuring a room impulse response you are dealing with an inherently high-latency system (relative to Bela's latency of <1 ms). Whatever your method of estimation (those based on Farina's work, MLS, or even wavelets and impulses), I would bet that measuring and estimating at a reasonable resolution between 50 and 500 Hz takes significantly longer than the latency of Bela. If you want 1 Hz resolution, you need a whole second's worth of recording (48,000 samples at a 48 kHz sample rate), plus the delay between the start of propagation and the arrival of the first wavefront at the microphone. Even if you are doing real-time I/O in the frequency domain, you face the same restriction on the minimum number of samples your buffer needs before the resolution is good enough for your transfer function to be of much use, particularly if you are sniffing for room modes and trying to perform some correction.
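To put rough numbers on that, here is a minimal sketch of the arithmetic. The function name, the 343 m/s speed of sound and the 3.43 m source-to-microphone path are my own assumptions for illustration, not anything specific to Bela.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C


def min_measurement(sample_rate_hz, resolution_hz, path_m=0.0):
    """Return (samples, seconds) needed for a given frequency resolution.

    The recording has to last about 1 / resolution_hz seconds whatever the
    sample rate is; the sample rate only sets how many samples that takes.
    path_m is a hypothetical source-to-microphone distance, used to add the
    time of flight of the first wavefront.
    """
    record_s = 1.0 / resolution_hz
    samples = math.ceil(sample_rate_hz * record_s)
    flight_s = path_m / SPEED_OF_SOUND_M_S
    return samples, record_s + flight_s


samples, total_s = min_measurement(48_000, 1.0, path_m=3.43)
print(samples, round(total_s, 3))  # 48000 1.01
```

So even before any processing you are waiting about a second, roughly three orders of magnitude longer than a sub-millisecond audio latency.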
I would expect that with a buffer that large, any 'drift' or random noise will average out, leaving something spectrally closer to the H of the Bela itself, so you could probably just window out the regions of the spectrum you aren't interested in. In that case you can pay attention to the SNR and work on maximising it to get the best representation of the peaks. Remember that over a long enough recording, many noise sources approach a ruler-flat spectrum, so more time is better, and time trumps sample rate.
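As a toy illustration of why more time helps, the sketch below averages repeated noise records and compares the residual RMS. It assumes the noise is white, Gaussian and uncorrelated from one repeat to the next, which real drift generally is not, so treat the result as a best case rather than a prediction.

```python
import math
import random


def rms(values):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def averaged_noise(num_repeats, length, rng):
    """Average num_repeats independent unit-variance Gaussian noise records."""
    total = [0.0] * length
    for _ in range(num_repeats):
        for i in range(length):
            total[i] += rng.gauss(0.0, 1.0)
    return [t / num_repeats for t in total]


rng = random.Random(0)  # fixed seed so the run is repeatable
single = rms(averaged_noise(1, 2000, rng))
averaged = rms(averaged_noise(64, 2000, rng))
print(f"noise RMS drops by a factor of about {single / averaged:.1f}")
```

Under those assumptions the residual noise falls roughly as the square root of the number of repeats, so 64 repeats buy you something near a factor of 8 (about 18 dB), while the repeatable part of the measurement is unaffected.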
If I were working on generating Hs between 50 and 500 Hz for a room, assuming I had the anechoic H of the sound source and the receiver, I would aim to work at a much lower sample rate, and I would work on a method of gathering the data by measuring over a long period of time. That way I could estimate and reduce the effect of the noise source on the spectrum of the H, and therefore on the impulse response.
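For the part about having the anechoic H of the source and receiver, one way to use them is to divide them out of the measured spectrum bin by bin. This is only a sketch of that idea: the function name, the tiny regularisation term eps and the three made-up bins are all hypothetical, and a real measurement would need more care wherever the transducers have very little output.

```python
def room_response(measured, source, receiver, eps=1e-12):
    """Estimate the room's H per frequency bin by dividing out the transducers.

    measured, source and receiver are equal-length lists of complex values,
    one per frequency bin. eps is a small assumed regularisation term that
    avoids dividing by zero where the source/receiver chain has no output.
    """
    estimate = []
    for m, s, r in zip(measured, source, receiver):
        chain = s * r  # combined anechoic response of source and receiver
        estimate.append(m * chain.conjugate() / (abs(chain) ** 2 + eps))
    return estimate


# Made-up three-bin example, purely to show the round trip.
source = [1.0 + 0.0j, 0.8 - 0.2j, 0.5 + 0.1j]
receiver = [0.9 + 0.1j, 1.0 + 0.0j, 1.1 - 0.3j]
room = [2.0 + 0.0j, 0.3 + 1.5j, 0.1 - 0.2j]
measured = [s * r * h for s, r, h in zip(source, receiver, room)]

recovered = room_response(measured, source, receiver)
print([round(abs(h), 3) for h in recovered])
```

In this noise-free toy case the room values come back essentially exactly; with real data, the long averaging time is what keeps the division from amplifying noise.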