So, I've had an AI for a few months, but PhD submission / IDE / Trill / Soul and other things meant we haven't done a lot with it yet.
Current state is:
- we can run audio with ALSA; however, one of the internal clocks seems to be a few MHz off, so when requesting a 48kHz sampling rate the actual rate comes out slightly different. This at least validates that the pinouts should 99% be fine and the Bela cape should work without hardware modifications.
- haven't tested SPI, it seems that the driver doesn't load properly, so that's the next thing I will try out
- the PRU code will need a fair bit of rewriting (besides the offsets, the McASP is slightly different on the AM5729 vs the AM3358)
- haven't looked for a Xenomai build yet, however this is the very last step (all the code base can run with some minor mods without Xenomai).
As soon as I have managed to run the audio codec in ALSA mode on the release image (my experiments so far were on a pre-release image), I will post it here. I think the CTAG codec could also be made to work fairly easily (although @henrix may have mentioned that the driver broke on recent Linux releases?). It also seems that to change the pinmuxing you have to rebuild U-Boot and bake in the pinmux settings (the .dtb/.dtbo files are only read/used by Linux to load the appropriate drivers, but the pins have to be set BEFORE Linux starts, though I don't remember exactly why).
In terms of raw CPU performance, generating 100 seconds of techno-world (without I/O, simply crunching the numbers) takes:
- 48.2s on BBB (at 1GHz)
- 9.4s on the AI (at 1GHz)
- 6.3s on the AI (at 1.5GHz)
- 0.94s on my laptop (i5 2.9GHz)
So, for the same clock, using mainly VFP instructions (as libpd
does), the AI is about 5x faster (as expected, because the VFP itself should be 10x faster, but then there are load/store operations in there as well). I would expect fully-optimized NEON code to run about the same on the two, although the A15 could have some extra boost in that one too (or not: the NEON on the A9, for instance, is slightly slower than on the A8).
The main issue with the AI going forward, in my opinion, is heat dissipation. It ships with a heatsink, but I understand it needs a fan to be able to run at full speed (1.5GHz). This means that a cape cannot fit the normal way, as some extra room has to be left for the fan. I am using some extra stackable headers to leave more vertical space for the fan, but that could be an issue in environments where space is at a premium (e.g. eurorack modules (wink wink nudge nudge)).
(the following picture is staged: I am just resting the board on a heatsink for the purpose of taking the picture)
The image it ships with relies heavily on the CPU governor for thermal throttling, but keep in mind that this would have to be disabled on a Xenomai image (and is, either way, not very RT-friendly). I haven't looked at the GPU/DSP/EVE/M4 yet, although I would want at least to be able to switch them off when unused, to reduce power consumption; it seems that this is difficult/impossible at present, although maybe their clocks can be dropped. My understanding is that currently you may be able to run at 1GHz with just the heatsink, but I haven't seen this confirmed anywhere yet (the system reference manual is quickly coming together but is not yet complete).
@henrix worked on a DSP library for the DSPs on the X15, which should work just fine on the AI (the AM5728 on the X15 is the same SoC as the AM5729 on the AI, except it doesn't have the EVEs); here it is, though I haven't looked at it in years.
The AI actually has a dual PRU subsystem, which means 2x of everything that was already on the BBB (2 cores, 1 UART, 1 industrial Ethernet, 1 IRQ controller, 28k of data RAM, etc.). I don't think there are any immediate consequences for our application (we manage to do everything on one PRU core, use only one of the 10 channels on the interrupt controller, and nothing else), but I think it could help for some higher-bandwidth applications (such as beaglelogic), or running several of these in parallel, or if there were custom communication to be implemented, though - honestly - in that case I'd probably go for one of the M4s instead 🙂.
A final thought: the AI is a great board, but its heat dissipation and cost (note: it comes with built-in wifi) do not make it ideal for all situations, and taking advantage of all its extra features (DSP/EVE/GPU) from within the Bela environment would require considerable effort from both us and the user. While we hope that existing and future Bela users will have the possibility to upgrade their Bela setup to include the AI, and enjoy the associated "free" speedup, we also have to strike the right balance between how much time we can pour into it vs supporting the existing community and working towards new products.