[FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.#20559
KexinFeng wants to merge commits into apache:master from
Conversation
Hey @KexinFeng , Thanks for submitting the PR
CI supported jobs: [sanity, windows-gpu, clang, edge, unix-cpu, windows-cpu, unix-gpu, centos-cpu, website, miscellaneous, centos-gpu]
python/mxnet/gluon/block.py
Outdated
    if not self._active:
        raise RuntimeError('Hybridize must be active in order to use mark_vars')
For a good user experience, it would be great if we have an API that supports both hybrid and imperative mode. It doesn't need to use the same backend implementation, but it would be good to have a consistent frontend API.
Now both the hybridized and unhybridized modes are unified. This is shown in the last two examples in unittest/test_autograd.py.
@mxnet-bot run ci [unix-gpu]

Jenkins CI successfully triggered : [unix-gpu]

@mxnet-bot run ci [website]

Jenkins CI successfully triggered : [website]

@mxnet-bot run ci [website]

Jenkins CI successfully triggered : [website]

@mxnet-bot run ci [linkcheck]

None of the jobs entered are supported.

@mxnet-bot run ci [centos-gpu]

Jenkins CI successfully triggered : [centos-gpu]
src/imperative/imperative_utils.cc
Outdated
    int mark_id = std::stoi(it->second);
    CHECK_LT(mark_id, nleafs.size())
        << "Mark_id exceeds the nonleaf list size.";
    nleafs[mark_id]->copy_autograd_entry_(ndoutputs[0]);
Have you verified this is correct if a marked variable has multiple outputs? I see your test case below only considers variables with a single output:
def forward(self, a, b):
    out1 = a*b
    out2 = out1 * a
    self.mark_vars(out1)
    return out2
I have added a unit test that considers multiple outputs. It is in the commit 'multiple_outputs' and is also shown below. It shows that this works as expected.

def forward(self, a, b, c):
    out1 = self.intermediate(('out1_0', 'out1_1'), ((a+b)*c, a*b), grad_req='write')
    out2 = self.intermediate('out2', out1[1] * a)
    return out2

In the commented snippet above, the use of ndoutputs[0] on line 177 indeed assumes that each computation node in the forward graph has only one output. This is consistent with line 152. Based on the test above, this assumption currently also holds for the multiple-output case ((a+b)*c, a*b).
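The lookup in the C++ snippet under discussion can be modeled in a few lines of Python. This is an illustrative sketch only (the real code lives in src/imperative/imperative_utils.cc); `NDArrayStub` and `link_marked_output` are hypothetical names introduced here to mirror the C++ logic:

```python
class NDArrayStub:
    """Toy stand-in for an NDArray that can copy an autograd entry."""
    def __init__(self, name):
        self.name = name
        self.autograd_entry = None

    def copy_autograd_entry_(self, other):
        # Link this marked nonleaf array to the node that produced `other`.
        self.autograd_entry = other.autograd_entry


def link_marked_output(node_attrs, nleafs, ndoutputs):
    """If the node carries a mark id, attach the autograd entry of its
    first output (ndoutputs[0]) to the corresponding marked nonleaf array."""
    mark_id_str = node_attrs.get("mark_id")
    if mark_id_str is None:
        return None
    mark_id = int(mark_id_str)  # mirrors std::stoi(it->second)
    assert mark_id < len(nleafs), "Mark_id exceeds the nonleaf list size."
    nleafs[mark_id].copy_autograd_entry_(ndoutputs[0])
    return mark_id
```

The single-output assumption shows up here as the hard-coded `ndoutputs[0]`: only the first output of the node is ever linked to the marked array.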
@mxnet-bot run ci [centos-gpu]

Jenkins CI successfully triggered : [centos-gpu]
Description
The PR adds support for fetching the gradients of intermediate variables in a gluon HybridizedBlock. This applies uniformly whether `block.hybridize()` is on or off. It extends the `attach_grad` support implemented in PR #20500. The motivation for this feature comes from issue #11865.
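For intuition, the gradient that this feature exposes can be checked by hand on the running example from the review discussion (out1 = a*b, out2 = out1*a): the gradient of out2 with respect to the marked nonleaf out1 is a, while the full leaf gradient with respect to a is 2*a*b. A minimal sketch of these hand-derived values (plain Python, no mxnet required; `grads` is a name introduced here for illustration):

```python
def grads(a, b):
    """Compute out2 and the hand-derived gradients for out1 = a*b, out2 = out1*a."""
    out1 = a * b
    out2 = out1 * a
    d_out2_d_out1 = a        # gradient w.r.t. the marked nonleaf out1
    d_out2_d_a = 2 * a * b   # full leaf gradient, by the product rule
    return out2, d_out2_d_out1, d_out2_d_a
```

With the feature in place, marking out1 and running backward should yield d_out2_d_out1, which is not recoverable from the leaf gradients alone.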
Checklist
Essentials
Changes
- `block.py`: `mark_vars` and `get_mark_vars` are added, along with `MXNDArrayMarkDCVariables`.
- `cached_op.invoke` in the cpp backend and `CachedOp.__call__` have been edited to pass the marked nonleaf ndarrays through.
- A `set_nleafs` method is added to the `CachedOp` class to store the marked nonleaf ndarrays.
- In `void RunGraph`, the marked nonleaf ndarrays are linked to the marked computational nodes for autograd computation.

Comments
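The frontend/backend handshake described in the Changes section can be sketched as a toy model. This is illustrative only; `ToyBlock` and `ToyCachedOp` are hypothetical stand-ins for the real gluon Block and C++ CachedOp, and only the bookkeeping (not the autograd linking) is modeled:

```python
class ToyCachedOp:
    """Stand-in for CachedOp: stores the marked nonleaf arrays via set_nleafs."""
    def __init__(self):
        self._nleafs = []

    def set_nleafs(self, nleafs):
        # Store the marked nonleaf ndarrays so the graph pass can link them later.
        self._nleafs = list(nleafs)


class ToyBlock:
    """Stand-in for a gluon block exposing mark_vars / get_mark_vars."""
    def __init__(self):
        self._marked = []
        self._cached_op = ToyCachedOp()

    def mark_vars(self, *variables):
        # Record the marked variables and forward them to the cached op.
        self._marked.extend(variables)
        self._cached_op.set_nleafs(self._marked)

    def get_mark_vars(self):
        return list(self._marked)
```

In the real PR the handoff additionally crosses the C API boundary (`MXNDArrayMarkDCVariables`), which this sketch omits.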