
Conversation


@ranman ranman commented Aug 26, 2020

Fixes #41966

Raises a ValueError if the user attempts to create a SparseAdam optimizer with sparse parameter tensors.
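
A minimal reproduction of the intended behavior (editorial sketch; the exact error wording follows the snippet discussed later in this thread and may differ in the merged code):

import torch
import torch.optim as optim

# A sparse COO tensor; SparseAdam expects dense parameter tensors.
sparse_param = torch.zeros(3, layout=torch.sparse_coo)

try:
    optim.SparseAdam([sparse_param])
except ValueError as err:
    # e.g. "Sparse params at indices [0]: SparseAdam requires dense parameter tensors"
    print(err)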

@zou3519 zou3519 self-requested a review August 26, 2020 22:33

dr-ci bot commented Aug 26, 2020

💊 CI failures summary and remediations

As of commit 46c4279 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



Comment on lines 362 to 363
Contributor

"params" should be a list, so the correct usage is:
optim.SparseAdam([{"params": [torch.zeros(3, layout=torch.sparse_coo)]}])

The documentation for that is here: https://pytorch.org/docs/master/optim.html#per-parameter-options
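
For reference, a quick editorial sketch of the two input forms the optimizer accepts per those docs: a flat iterable of tensors, or a list of dicts whose "params" entries are lists of tensors.

import torch
import torch.optim as optim

dense = torch.zeros(3, requires_grad=True)

# Form 1: a plain iterable of parameter tensors.
opt_plain = optim.SparseAdam([dense])

# Form 2: per-parameter options, i.e. a list of dicts, each with a "params" list.
opt_groups = optim.SparseAdam([{"params": [dense], "lr": 1e-3}])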

Comment on lines 37 to 38
Contributor

If we get a parameter dictionary, then param should be a List of Tensors. So instead of checking if param is sparse, we want to check if any of the tensors in param is sparse.
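
A minimal sketch of that check (editorial illustration; the variable names and error text are placeholders, not the exact code in this PR):

for index, param in enumerate(params):
    if isinstance(param, dict):
        # Per-parameter options: "params" holds a list of tensors.
        if any(t.is_sparse for t in param.get("params", [])):
            raise ValueError("SparseAdam requires dense parameter tensors")
    elif param.is_sparse:
        raise ValueError("SparseAdam requires dense parameter tensors")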

@zou3519 zou3519 (Contributor) left a comment

The handling of the case where the user passes in per-parameter options seems to be wrong (see https://pytorch.org/docs/master/optim.html#per-parameter-options), but the general approach here looks good to me.

@ranman ranman force-pushed the bugfix/41966-SparseAdam-error-message branch from fd5b315 to 97fc86c on August 27, 2020 20:25
@ranman ranman force-pushed the bugfix/41966-SparseAdam-error-message branch from 97fc86c to 46c4279 on August 27, 2020 20:30
ranman (Author) commented Aug 27, 2020

@zou3519

Can you think of a better way than nested iteration over all the params when one of them is a dictionary?

# Collect the positions of any sparse tensors so the error can point at them.
sparse_params = []
for index, param in enumerate(params):
    if isinstance(param, dict):
        # Per-parameter options: check each tensor in the group's "params" list.
        for d_index, d_param in enumerate(param.get("params", [])):
            if d_param.is_sparse:
                sparse_params.append([index, d_index])
    elif param.is_sparse:
        sparse_params.append(index)
if sparse_params:
    raise ValueError(
        f"Sparse params at indices {sparse_params}: SparseAdam requires dense parameter tensors"
    )

Would it make sense to let the optimizer parse the params and then check the param groups after the super call? It would still involve a nested is_sparse check on all the parameters.

e.g.

for group in self.param_groups:
    for param in group['params']:
        if param.is_sparse:
            raise ValueError("SparseAdam requires dense parameter tensors")

I think I prefer the first approach since it lets a user know where in their input the error is coming from.
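
For illustration (editorial sketch, assuming the first snippet above), a user who passes per-parameter options containing a sparse tensor would see something like:

import torch
import torch.optim as optim

dense = torch.zeros(3, requires_grad=True)
sparse = torch.zeros(3, layout=torch.sparse_coo)

try:
    optim.SparseAdam([{"params": [dense, sparse]}])
except ValueError as err:
    # With the check above this prints the nested index, e.g.
    # "Sparse params at indices [[0, 1]]: SparseAdam requires dense parameter tensors"
    print(err)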

zou3519 (Contributor) commented Aug 27, 2020

@ranman

There are some pros and cons to each approach. For the first approach:

  • (pro): We tell the user exactly where their sparse tensor is
  • (con): more code
  • (con): Not as robust. If we ever changed how inputs are passed in, this code would have to change because it assumes the current input structure; then again, a lot of other things would have to change in that case anyway.

For the second approach:

  • (pro) more robust
  • (pro) less code
  • (con) Error message doesn't tell the user exactly where they have a sparse tensor, so they'll have to do some debugging.

I don't have a real preference, so let's go for the first approach. It might look weird but at least we're giving the user a nice error message :)


codecov bot commented Aug 27, 2020

Codecov Report

Merging #43668 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##           master   #43668   +/-   ##
=======================================
  Coverage   69.34%   69.35%           
=======================================
  Files         378      378           
  Lines       46680    46690   +10     
=======================================
+ Hits        32369    32380   +11     
+ Misses      14311    14310    -1     
Impacted Files                          Coverage Δ
torch/optim/sparse_adam.py              91.42% <100.00%> (+1.42%) ⬆️
torch/testing/_internal/expecttest.py   78.57% <0.00%> (+1.02%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@facebook-github-bot facebook-github-bot left a comment

@ranman has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


@ranman merged this pull request in 24eea36.

Successfully merging this pull request may close these issues: SparseAdam tries to call is_contiguous and fails (#41966)
