|
| 1 | +== "Zvzip" Extension for Reordering Structured Data, Version 0.1 |
| 2 | + |
| 3 | +This chapter describes the Zvzip standard extension for reordering structured |
| 4 | +data in vector registers. These instructions address use cases such as packing |
| 5 | +and unpacking data structures (for example, the color components of a pixel or |
| 6 | +the real and imaginary components of complex numbers) and transposing small |
| 7 | +matrices, among others. |
| 8 | + |
| 9 | +=== Vector Zip Instruction |
| 10 | + |
| 11 | +The vector zip instruction (VZIP) interleaves the elements at each index of |
| 12 | +the two source vector register groups into the destination vector register |
| 13 | +group. This instruction operates with an effective vector length of 2*VL. The |
| 14 | +destination EMUL is 2*LMUL. The instruction is reserved when LMUL is 8. |
| 15 | + |
| 16 | +[wavedrom, , svg] |
| 17 | +.... |
| 18 | +{reg:[ |
| 19 | +{bits: 7, name: 'OP-V'}, |
| 20 | +{bits: 5, name: 'vd'}, |
| 21 | +{bits: 3, name: 'OPMVV'}, |
| 22 | +{bits: 5, name: 'vs1'}, |
| 23 | +{bits: 5, name: 'vs2'}, |
| 24 | +{bits: 1, name: 'vm'}, |
| 25 | +{bits: 6, name: '111110', attr: ['VZIP']}, |
| 26 | +]} |
| 27 | +.... |
| 28 | + |
| 29 | +The destination vector register group may overlap the source vector register |
| 30 | +group if the overlap is in the highest-numbered part of the destination |
| 31 | +register group and the source EMUL is at least 1. If the overlap violates these |
| 32 | +constraints, the instruction encoding is reserved. |
| 33 | + |
| 34 | +---- |
| 35 | +vzip.vv vd, vs2, vs1, vm # for i in 0 to (2*VL)-1 |
| 36 | + # vd[i] = (i % 2 == 0) ? vs2[i/2] : vs1[i/2] |
| 37 | +---- |
| 38 | + |
| 39 | +=== Vector Unzip Even and Unzip Odd Instructions |
| 40 | + |
| 41 | +The vector unzip-even instruction (VUNZIPE) extracts VL even-indexed elements |
| 42 | +from the source vector register group into the destination vector register |
| 43 | +group. |
| 44 | + |
| 45 | +The vector unzip-odd instruction (VUNZIPO) extracts VL odd-indexed elements |
| 46 | +from the source vector register group into the destination vector register |
| 47 | +group. |
| 48 | + |
| 49 | +[wavedrom, , svg] |
| 50 | +.... |
| 51 | +{reg:[ |
| 52 | +{bits: 7, name: 'OP-V'}, |
| 53 | +{bits: 5, name: 'vd'}, |
| 54 | +{bits: 3, name: 'OPMVV'}, |
| 55 | +{bits: 5, name: 'op', attr: ['01011','01111']}, |
| 56 | +{bits: 5, name: 'vs2'}, |
| 57 | +{bits: 1, name: 'vm'}, |
| 58 | +{bits: 6, name: '010010', attr: ['VUNZIPE','VUNZIPO']}, |
| 59 | +]} |
| 60 | +.... |
| 61 | + |
| 62 | +These instructions access 2*VL elements in the source vector register group, |
| 63 | +and the source EMUL is 2*LMUL. The instructions are reserved when LMUL is 8. |
| 64 | + |
| 65 | +The destination vector register group may overlap the source vector register |
| 66 | +group if the overlap is in the lowest-numbered part of the source register group. |
| 67 | +If the overlap violates these constraints, the instruction encoding is reserved. |
| 68 | + |
| 69 | +---- |
| 70 | +vunzipe.v vd, vs2, vm # vd[i] = vs2[(2 * i)] |
| 71 | +vunzipo.v vd, vs2, vm # vd[i] = vs2[((2 * i) + 1)] |
| 72 | +---- |
| 73 | + |
| 74 | +=== Vector Pair Even and Pair Odd Instructions |
| 75 | + |
| 76 | +The vector pair-even instruction (VPAIRE) interleaves the even-indexed |
| 77 | +elements of the source vector register groups into the destination vector |
| 78 | +register group. |
| 79 | + |
| 80 | +The vector pair-odd instruction (VPAIRO) interleaves the odd-indexed |
| 81 | +elements of the source vector register groups into the destination vector |
| 82 | +register group. |
| 83 | + |
| 84 | +[wavedrom, , svg] |
| 85 | +.... |
| 86 | +{reg:[ |
| 87 | +{bits: 7, name: 'OP-V'}, |
| 88 | +{bits: 5, name: 'vd'}, |
| 89 | +{bits: 3, name: 'funct3', attr: ['OPIVV','OPMVV']}, |
| 90 | +{bits: 5, name: 'vs1'}, |
| 91 | +{bits: 5, name: 'vs2'}, |
| 92 | +{bits: 1, name: 'vm'}, |
| 93 | +{bits: 6, name: '001111', attr: ['VPAIRE','VPAIRO']}, |
| 94 | +]} |
| 95 | +.... |
| 96 | +The destination register cannot overlap the source registers and, if masked, |
| 97 | +cannot overlap the mask register. If the overlap violates these constraints, |
| 98 | +the instruction encoding is reserved. |
| 99 | + |
| 100 | +---- |
| 101 | +vpaire.vv vd, vs2, vs1, vm # vd[i] = (i % 2) == 0 ? vs2[i + 0] : vs1[i - 1] |
| 102 | +vpairo.vv vd, vs2, vs1, vm # vd[i] = (i % 2) == 0 ? vs2[i + 1] : vs1[i + 0] |
| 103 | +---- |
| 104 | + |
| 105 | +[NOTE] |
| 106 | +==== |
| 107 | +VPAIRO may read one element past VL in vs2 if VL is odd. The general policy for |
| 108 | +such cases is to return the value 0 when the index is greater than or equal to |
| 109 | +VLMAX in the source vector register group. |
| 110 | + |
| 111 | +The following example illustrates the use of the vector pair-even and pair-odd |
| 112 | +instructions to transpose VL/4 4x4 matrices. |
| 113 | + |
| 114 | +---- |
| 115 | +vsetvli t0, zero, e32, m1, ta, ma |
| 116 | +vpaire.vv v5, v1, v2 # |a|b|c|d|A|B|C|D|.. |a|e|c|g|A|E|C|G|.. |
| 117 | +vpairo.vv v6, v1, v2 # |e|f|g|h|E|F|G|H|.. -> |b|f|d|h|B|F|D|H|.. |
| 118 | +vpaire.vv v7, v3, v4 # |i|j|k|l|I|J|K|L|.. |i|m|k|o|I|M|K|O|.. |
| 119 | +vpairo.vv v8, v3, v4 # |m|n|o|p|M|N|O|P|.. |j|n|l|p|J|N|L|P|.. |
| 120 | +vsetvli t0, zero, e64, m1, ta, ma |
| 121 | +vpaire.vv v1, v5, v7 # |a e|c g|A E|C G|.. |a e|i m|A E|I M|.. |
| 122 | +vpaire.vv v2, v6, v8 # |b f|d h|B F|D H|.. -> |b f|j n|B F|J N|.. |
| 123 | +vpairo.vv v3, v5, v7 # |i m|k o|I M|K O|.. |c g|k o|C G|K O|.. |
| 124 | +vpairo.vv v4, v6, v8 # |j n|l p|J N|L P|.. |d h|l p|D H|L P|.. |
| 125 | +---- |
| 126 | + |
| 127 | +==== |
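The element-level pseudocode in the listings above can be checked with a small software model. The following is a minimal sketch in Python, not part of the proposed specification text. It assumes that register groups can be modelled as plain lists, that masking and tail policy can be ignored, and that an out-of-range source read returns 0. The helper names `read`, `wide`, `flat` and `transpose4x4` are hypothetical and exist only for illustration.

```python
"""Element-level sketch of the Zvzip reordering instructions.

Assumptions (not taken from the specification text): vector register groups
are plain Python lists, masking and tail handling are ignored, and a source
read past the end of the list returns 0.
"""


def read(src, idx):
    """Return src[idx], or 0 when idx is past the end of the source list."""
    return src[idx] if idx < len(src) else 0


def vzip(vs2, vs1, vl):
    """vd[i] = vs2[i/2] for even i, vs1[i/2] for odd i, over 2*VL elements."""
    return [(vs2 if i % 2 == 0 else vs1)[i // 2] for i in range(2 * vl)]


def vunzipe(vs2, vl):
    """vd[i] = vs2[2*i]: the VL even-indexed elements of a 2*VL source."""
    return [vs2[2 * i] for i in range(vl)]


def vunzipo(vs2, vl):
    """vd[i] = vs2[2*i + 1]: the VL odd-indexed elements of a 2*VL source."""
    return [vs2[2 * i + 1] for i in range(vl)]


def vpaire(vs2, vs1, vl):
    """vd[i] = vs2[i] for even i, vs1[i - 1] for odd i."""
    return [vs2[i] if i % 2 == 0 else vs1[i - 1] for i in range(vl)]


def vpairo(vs2, vs1, vl):
    """vd[i] = vs2[i + 1] for even i, vs1[i] for odd i."""
    return [read(vs2, i + 1) if i % 2 == 0 else vs1[i] for i in range(vl)]


def wide(v):
    """View pairs of adjacent elements as one double-width element."""
    return [tuple(v[i:i + 2]) for i in range(0, len(v), 2)]


def flat(v):
    """Undo wide(): expand double-width elements back to single elements."""
    return [x for pair in v for x in pair]


def transpose4x4(rows):
    """Transpose 4x4 matrices held one row per register, in two stages."""
    r0, r1, r2, r3 = rows
    vl = len(r0)
    # Stage 1 (SEW=32 in the assembly example): pair adjacent rows.
    v5, v6 = vpaire(r0, r1, vl), vpairo(r0, r1, vl)
    v7, v8 = vpaire(r2, r3, vl), vpairo(r2, r3, vl)
    # Stage 2 (SEW=64): the same pairing on double-width elements.
    w5, w6, w7, w8 = (wide(v) for v in (v5, v6, v7, v8))
    n = vl // 2
    return [flat(vpaire(w5, w7, n)), flat(vpaire(w6, w8, n)),
            flat(vpairo(w5, w7, n)), flat(vpairo(w6, w8, n))]


if __name__ == "__main__":
    rows = [list("abcd"), list("efgh"), list("ijkl"), list("mnop")]
    for row in transpose4x4(rows):
        print("".join(row))
```

In this model `vunzipe` and `vunzipo` invert `vzip`, and `transpose4x4` follows the same two-stage sequence as the assembly example in the note, with the SEW change modelled by regrouping elements into pairs.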