23 | | Our kernel is directly written in IL. After compilation into a machine code, I analyze the generated VLIW instructions. Results are obtained with fglrx 8.65.4 (Catalyst 9.9) and CAL 1.4beta. Almost 90% of the time, "5870_opt kernel" runs at maximum efficiency (here, I mean more than 4 slots are occupied by some instructions). It is compared to 83% in the case of our old kernel. |
| 23 | Our kernel is directly written in IL. After compilation into a machine code, I analyze the generated VLIW instructions. Results are obtained with fglrx 8.65.4 (Catalyst 9.9) and CAL 1.4beta. |
| 24 | |
| 25 | Both RV770 and RV870 archtecture has 5-way VLIW units at its heart. Depnding on detailed computations, it is supposed that the device driver (or internal compiler?) tries to fill 5-slots as much as possible. The first column shows a number of occupied slots and the second column shows a number of corresponding VLIW instructions appeared in the kernel. The third column indicates a fraction of VLIW instructions with a given number of slots. The last row presents a total numberf VLIW instructions in the kernel. |