Investigating Single Precision Floating General Matrix Multiply in Heterogeneous Hardware