Activity Analysis #455
Based on our previous discussion in #452, I feel like supporting inactive arguments might be slightly simpler than we anticipated. We could decide that every argument which is not a
Yeah, I think you're right that we can ensure that the behaviour is to error. For example, the rule for

In terms of design, I suspect that we would be better off having a wrapper type that explicitly states that a value is inactive, otherwise we risk issues inside nested AD (I think). This is a detail we can figure out though.

Note that adding this kind of functionality would make it possible to resolve #412. The various SpecialFunctions.jl rules that we cannot currently support all involve read-only data (always floats I believe) which is typically inactive (this is the case that Zygote can support). It would be very nice to have this resolved.
One slightly broader question about inactive wrapper types: can we use them to detect and track loop invariants (e.g. array refs and induction variables, which are common and often don't contribute to gradients), as discussed in #156, and then skip pushing these inactive variables to block stacks?
I'm not sure that we can do this in general, but it will definitely have an effect on what we need to shove in adjoints. This will be operation-dependent, so let's consider a couple of examples. The rule for

    function rrule!!(::CoDual{typeof(*)}, _x::CoDual{Float64}, _y::CoDual{Float64})
        x = primal(_x)
        y = primal(_y)
        z = x * y
        # The adjoint closes over both x and y.
        mul_adjoint(dz::Float64) = NoRData(), dz * y, dz * x
        return CoDual(z, NoFData()), mul_adjoint
    end

Importantly, both

Suppose that

    function rrule!!(::CoDual{typeof(*)}, _x::CoDual{Float64}, _y::Inactive{Float64})
        x = primal(_x)
        y = primal(_y)
        z = x * y
        # Only y is needed to compute the gradient w.r.t. x, so only y is captured.
        mul_adjoint(dz::Float64) = NoRData(), dz * y, GradNotRequired()
        return CoDual(z, NoFData()), mul_adjoint
    end

The size of the adjoint is now only 8B, so the total amount of memory used is halved. This is probably quite valuable if this operation is called inside a large loop. Of course, if both

Similarly, the adjoint returned by the rule for

Is this the kind of thing that you had in mind, @yebai?
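To make the size comparison above concrete, here is a minimal self-contained sketch that runs without Mooncake. Everything in it is a toy stand-in defined purely for illustration: `CoDual`, `NoFData`, `NoRData` and `primal` only mimic the names used above, `Inactive` and `GradNotRequired` are the hypothetical types from this discussion, and `toy_rrule` is a made-up name rather than Mooncake's `rrule!!`. It assumes the adjoint is an ordinary Julia closure, so its size is the size of whatever it captures.

```julia
# Toy stand-ins; none of these are Mooncake's real definitions.
struct NoFData end
struct NoRData end
struct GradNotRequired end    # hypothetical marker: no gradient needed for this argument

struct CoDual{P,F}
    x::P
    dx::F
end

struct Inactive{P}            # hypothetical wrapper marking a value as inactive
    x::P
end

primal(c::CoDual) = c.x
primal(c::Inactive) = c.x

# Both arguments active: the adjoint closes over x and y.
function toy_rrule(::typeof(*), _x::CoDual{Float64}, _y::CoDual{Float64})
    x = primal(_x)
    y = primal(_y)
    adj(dz::Float64) = (NoRData(), dz * y, dz * x)
    return CoDual(x * y, NoFData()), adj
end

# Second argument inactive: the adjoint closes over y only.
function toy_rrule(::typeof(*), _x::CoDual{Float64}, _y::Inactive{Float64})
    x = primal(_x)
    y = primal(_y)
    adj(dz::Float64) = (NoRData(), dz * y, GradNotRequired())
    return CoDual(x * y, NoFData()), adj
end

z_active, adj_active = toy_rrule(*, CoDual(2.0, NoFData()), CoDual(3.0, NoFData()))
z_inactive, adj_inactive = toy_rrule(*, CoDual(2.0, NoFData()), Inactive(3.0))

# Size in bytes of each closure, i.e. of the Float64 values it captures.
println(sizeof(adj_active), " vs ", sizeof(adj_inactive))
```

The primal result is the same in both cases; only what the adjoint has to keep alive changes, which is the part that matters when the rule is hit many times inside a loop.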
#452 highlighted a situation in which our lack of activity analysis can cause substantial slow-downs. The purpose of this issue is to sketch out how activity analysis might be implemented in Mooncake.
Todo: