[Op] SelectOp short-circuit evaluation #65
Comments
Outline the
module {
  func @select(%arg0: i32, %arg1: memref<10xi32>) -> i32 {
    %c0_i32 = arith.constant 0 : i32
    %0 = arith.cmpi sgt, %arg0, %c0_i32 : i32
    cond_br %0, ^bb1(%arg0 : i32), ^bb2
  ^bb1(%1: i32):  // pred: ^bb0
    %2 = arith.index_cast %1 : i32 to index
    %3 = affine.load %arg1[%2 - 1] : memref<10xi32>
    return %3 : i32
  ^bb2:  // pred: ^bb0
    %c0_i32_0 = arith.constant 0 : i32
    return %c0_i32_0 : i32
  }
}
Proposal: Use Nested
|
For simplicity, I think it might make sense to separate the evaluation of the condition from the true/false branches. I think you can do this with scf.yield.
%ret_value = some init value
%cond1 = cmp lt %i %one
%cond = scf.if %cond1 -> (i1) {
%ele = memref.load A[%i - 1]
%cond2 = cmp gt %ele %zero
scf.yield %cond2 : i1
} else {
%false = arith.constant 0 : i1
scf.yield %false : i1
}
%res = scf.if %cond -> (out_type) {
// true branch
} else {
//false branch
}
It might also make sense to define a specific hcl.select that functions almost identically to scf.if but requires the else branch. This could simplify code generation because it would separate normal if statements from hcl.selects.
%ret_value = some init value
%cond1 = cmp lt %i %one
%cond = scf.if %cond1 -> (i1) {
%ele = memref.load A[%i - 1]
%cond2 = cmp gt %ele %zero
scf.yield %cond2 : i1
} else {
%false = arith.constant 0 : i1
scf.yield %false : i1
}
%res = hcl.select %cond -> (out_type) {
// true branch
} else {
//false branch
}
@andrewb1999 Thanks for your suggestions! I think separating the condition and the true/false branches is a good idea, but I have two questions:
// sequential if
%cond_12 = scf.if %cond1 () {
scf.yield %cond2
} else {
scf.yield %false
}
%cond_123 = scf.if %cond12 () {
scf.yield %cond3
} else {
scf.yield %false
}
// nested if
%cond = scf.if %cond1 () {
%cond_23 = scf.if %cond2 () {
scf.yield %cond3
} else {
scf.yield %false
}
scf.yield %cond_23
} else {
scf.yield %false
}
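To compare the two forms above, here is a minimal Python sketch (not MLIR, and not part of the HeteroCL flow). It assumes each condition is modeled as a zero-argument callable, so calling it stands in for executing the ops inside an scf.if region. The names sequential_and, nested_and, traced, and forms_agree are hypothetical. Under that assumption, both forms return the same value and evaluate the same conditions in the same order.

```python
from itertools import product


def sequential_and(c1, c2, c3):
    """Model of the 'sequential if' form: one guarded step per condition."""
    cond_12 = c2() if c1() else False
    cond_123 = c3() if cond_12 else False
    return cond_123


def nested_and(c1, c2, c3):
    """Model of the 'nested if' form: each condition nests inside the previous one."""
    if c1():
        if c2():
            return c3()
        return False
    return False


def traced(values, log):
    """Build thunks that append their 1-based index to `log` when evaluated."""
    def make(k, v):
        def thunk():
            log.append(k)
            return v
        return thunk
    return [make(k, v) for k, v in enumerate(values, start=1)]


def forms_agree():
    """Check all 8 truth assignments for equal results and equal evaluation traces."""
    for values in product([False, True], repeat=3):
        seq_log, nest_log = [], []
        seq = sequential_and(*traced(values, seq_log))
        nest = nested_and(*traced(values, nest_log))
        if seq != nest or seq_log != nest_log:
            return False
    return True
```

In this model the difference between the two forms is only structural: the sequential form grows linearly with the number of conditions, while the nested form adds one level of nesting per condition.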
|
%cond = hcl.and {
hcl.yield %cond1
} and {
hcl.yield %cond2
} and {
hcl.yield %cond3
}
This should be possible using the VariadicRegion class in MLIR TableGen (to support an arbitrary number of and regions). We can then define the semantics of this new operation to be short-circuiting, in the same way as && in C. Conditions can also be nested in a fairly natural way. When generating HLS code, it is a direct translation to cond1 && cond2 && cond3. When generating LLVM, we can translate it to a nested if as described above, or directly to basic blocks that represent short-circuiting, the same way short-circuiting is represented in LLVM itself. Obviously this form of implementation requires more work in TableGen, but to me this is the whole point of using MLIR. It seems worth it to leverage MLIR's features to provide a better high-level representation that we can translate to LLVM later.
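The intended semantics can be sketched in Python as follows. This is only a model of the proposal, not the hcl.and implementation: each region is assumed to be a zero-argument callable, and short_circuit_and, load, and guarded_condition are hypothetical names. The load helper is bounds-checked because plain Python indexing would silently wrap a negative index.

```python
def short_circuit_and(*regions):
    """Evaluate regions left to right and stop at the first falsy result, like && in C."""
    for region in regions:
        if not region():
            return False
    return True


def load(mem, idx):
    """Bounds-checked load that rejects negative indices instead of wrapping."""
    if not 0 <= idx < len(mem):
        raise IndexError(f"index {idx} out of bounds for size {len(mem)}")
    return mem[idx]


def guarded_condition(A, i):
    """Model of `i > 0 && A[i - 1] > 0`: the load runs only when i > 0 holds."""
    return short_circuit_and(
        lambda: i > 0,
        lambda: load(A, i - 1) > 0,
    )
```

With i = 0 the second region is never called, so the out-of-bounds load does not happen. A loop over an arbitrary number of regions corresponds to the variadic form; the nested-if lowering is the same loop unrolled.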
|
Right, I agree with that. Basically, we want to retain as much information as we can in MLIR. Those |
I added Next step I'll implement their lowering passes for LLVM backend. |
The current codegen for SelectOp does not support short-circuit evaluation, which may cause problems in some specific cases. For example, for hcl.select(i > 0, A[i - 1], 0), we need to check i > 0 first and only then load A[i - 1]. However, MLIR's strict SSA form means that both the true and false operands are computed before the select itself, so the branches cannot be evaluated lazily within a single expression. The following code shows this situation, which leads to out-of-bounds access of array A when i = 0.
A general scf.if statement may also have this problem.
Update: The above example is not the short-circuit one but is also related to the evaluation order. The short-circuit one is the below example, which is also not supported by our flow.
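The evaluation-order problem can be reproduced outside MLIR. The following Python sketch is an illustration only, not HeteroCL's codegen: eager_select, lazy_select, load, select_eager, and select_lazy are hypothetical names, and load is bounds-checked because plain Python indexing would wrap A[-1] instead of failing.

```python
def load(mem, idx):
    """Bounds-checked load that rejects negative indices instead of wrapping."""
    if not 0 <= idx < len(mem):
        raise IndexError(f"index {idx} out of bounds for size {len(mem)}")
    return mem[idx]


def eager_select(cond, true_val, false_val):
    """Both operands are already computed by the time the select runs."""
    return true_val if cond else false_val


def lazy_select(cond, true_fn, false_fn):
    """Only the chosen branch is evaluated, as with an if/else that owns its branches."""
    return true_fn() if cond else false_fn()


def select_eager(A, i):
    # load(A, i - 1) executes before the condition is consulted.
    return eager_select(i > 0, load(A, i - 1), 0)


def select_lazy(A, i):
    # The load is deferred and runs only when i > 0.
    return lazy_select(i > 0, lambda: load(A, i - 1), lambda: 0)
```

Calling select_eager(A, 0) raises IndexError because the load happens unconditionally, while select_lazy(A, 0) returns 0 without touching A.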