@rwarmstr got another question if you have time: I profiled the wide_vadd tutorial with a small
modification to add floats instead of integers. I also used 4 concurrent kernels.
Environment: U250, 48 core CPU, Ubuntu 18, latest XRT.
Timing for a single run of adding two buffers of size 8 * 1024 * 1024 ints was 24 ms.
This seems like a long time to simply add two buffers together. Is there anything else I can
do to speed things up? For example, is there a way of harnessing the PLRAM on the board
to make it faster?
Any insights here would be really appreciated.