Manifold Loss and P2SGrad Loss #635

Merged
merged 18 commits into KevinMusgrave:dev on Jun 18, 2023

Conversation

domenicoMuscill0
Contributor

I have implemented a first version of the manifold loss introduced in the paper https://openaccess.thecvf.com/content_CVPR_2019/papers/Aziere_Ensemble_Deep_Manifold_Similarity_Learning_Using_Hard_Proxies_CVPR_2019_paper.pdf
I have also changed the to_device function in the common_functions file to accept a list or tuple of tensors.
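Roughly, the change behaves like this sketch (function name taken from this discussion; the actual signature in common_functions may differ):

```python
import torch

# Sketch of the described to_device extension, not the library's exact code:
# recurse into lists/tuples, otherwise move the single tensor.
def to_device(x, device):
    if isinstance(x, (list, tuple)):
        return type(x)(to_device(t, device) for t in x)
    return x.to(device)

# e.g. to_device([torch.zeros(2), torch.ones(3)], torch.device("cpu"))
```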
I have found one issue that the paper does not explain: the similarity matrix S of the graph is defined using scalar products, but if the sum of a row is negative, raising the degree matrix D to the power -1/2 gives torch.nan.
To solve this I used abs(D) instead of D.
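Here is a minimal sketch of the problem and the workaround (illustrative only, not the exact code in manifold_loss.py):

```python
import torch

emb = torch.randn(5, 8)
S = emb @ emb.T               # similarity via scalar products; entries can be negative
d = S.sum(dim=1)              # row sums of S; may be negative
# d.pow(-0.5) yields NaN for negative entries (negative base, fractional
# exponent), so take the absolute value first, as described above:
D_inv_sqrt = torch.diag(d.abs().pow(-0.5))
S_norm = D_inv_sqrt @ S @ D_inv_sqrt   # symmetrically normalized similarity
```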

@KevinMusgrave
Owner

Thanks @domenicoMuscill0!

Would it be possible to write a test for this loss function? Usually for tests I compare the implementation with a more manual version of the loss function, like using for-loops instead of vectorized operations. Or I compare the implementation with the official implementation, if it exists.
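For example, the pattern looks roughly like this (a toy check on a pairwise-distance function, not an actual test from this PR):

```python
import torch

# Compare a vectorized computation against a plain for-loop reference.
def sq_dists_vectorized(x):
    diff = x.unsqueeze(1) - x.unsqueeze(0)
    return diff.pow(2).sum(dim=-1)

def sq_dists_loops(x):
    n = x.shape[0]
    out = torch.zeros(n, n)
    for i in range(n):
        for j in range(n):
            out[i, j] = (x[i] - x[j]).pow(2).sum()
    return out

x = torch.randn(6, 4)
assert torch.allclose(sq_dists_vectorized(x), sq_dists_loops(x), atol=1e-6)
```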

@domenicoMuscill0
Contributor Author

I have managed to get the original version of the code for the Manifold loss, and I am writing the tests. I have also implemented the P2SGrad loss and written and passed some basic tests. When I finish (hopefully tomorrow), will it be enough to just commit to my branch dev?

@KevinMusgrave
Owner

Yes

@KevinMusgrave
Owner

KevinMusgrave commented Jun 7, 2023

Looks like the tests are passing now. Thanks @domenicoMuscill0!

Can you remove the commented-out code and also run ./format_code.sh?

You'll need to install black, isort, and nbqa.

@KevinMusgrave
Owner

Also run ./run_linter.sh and see if there are any linter warnings. (This has more linter rules than the github workflow action.)

You'll need to install flake8.

@domenicoMuscill0
Contributor Author

I have removed the commented-out code lines; I had left them there because they were in the original file the author (@azieren) sent me. I have kept comments only where I explain the changes I made with respect to his original version.
I ran ./run_linter.sh and got 0 warnings. I must also highlight that the relative error bounds I selected are quite large, due to some noticeable numerical-stability issues I encountered in the original implementation. I have already contacted @azieren privately to discuss this. I have run many experiments and concluded that this is the only reason the results differ so much. However, I am not an expert, and I am available to make the necessary changes if needed. Thank you very much @KevinMusgrave for creating this library!
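For example (hypothetical numbers, just to illustrate the effect of the bound):

```python
import torch

# A small numerical drift fails a strict relative tolerance but passes a loose one.
a = torch.tensor([1.000, 2.000])   # reference values
b = torch.tensor([1.002, 1.996])   # values with slight numerical drift
print(torch.allclose(a, b, rtol=1e-5))  # False: tight bound rejects the drift
print(torch.allclose(a, b, rtol=1e-2))  # True: a looser bound tolerates it
```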

@domenicoMuscill0
Contributor Author

However, the numerical instability concerns only the Manifold loss. The P2SGrad loss matches both the existing implementation and the theoretical gradients described in the paper.
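For reference, a hedged sketch of the idea I tested against (illustrative function, not the library's API): the paper's prescribed gradients have the form of an MSE between the cosine similarities and the one-hot labels.

```python
import torch
import torch.nn.functional as F

# Sketch of the P2SGrad idea: its gradients match those of an MSE between
# the cosine similarities cos(theta) and the one-hot labels.
def p2sgrad_like_loss(embeddings, class_weights, labels):
    cos_theta = F.normalize(embeddings, dim=1) @ F.normalize(class_weights, dim=1).T
    one_hot = F.one_hot(labels, num_classes=class_weights.shape[0]).float()
    return (cos_theta - one_hot).pow(2).sum(dim=1).mean()

emb = torch.randn(8, 16, requires_grad=True)
W = torch.randn(4, 16)
y = torch.randint(0, 4, (8,))
p2sgrad_like_loss(emb, W, y).backward()  # populates emb.grad
```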

@KevinMusgrave
Owner

Thanks @domenicoMuscill0!

KevinMusgrave changed the title from "Manifold Loss" to "Manifold Loss and P2SGrad Loss" on Jun 12, 2023
@KevinMusgrave
Owner

@domenicoMuscill0 For ManifoldLoss, is there a reason for passing in labels as indices_tuple=labels instead of labels=labels?

@domenicoMuscill0
Contributor Author

Those labels are just the indices of the metaclasses/clusters to which the embeddings belong. I preferred to interpret them as indices tuples because this way it will be possible to add the Manifold loss to the supported self-supervised losses in the future. These indices tuples are not strictly required: in lines 61-66 of manifold_loss.py we initialize the metaclasses with random embeddings (as reported in the paper) if no cluster indices are passed to the function as the indices_tuple argument.
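For example, usage would look roughly like this (a sketch based on this discussion; treat the constructor argument as an assumption, since the merged API may differ):

```python
import torch
from pytorch_metric_learning import losses

embeddings = torch.randn(32, 128)
cluster_labels = torch.randint(0, 4, (32,))  # indices of metaclasses/clusters

loss_fn = losses.ManifoldLoss(l=128)  # l = embedding size (assumed parameter name)
loss = loss_fn(embeddings, indices_tuple=cluster_labels)  # with cluster indices
loss_random = loss_fn(embeddings)  # metaclasses initialized with random embeddings
```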

@KevinMusgrave
Owner

@domenicoMuscill0 Ok, makes sense. Thanks for implementing these difficult loss functions!

KevinMusgrave merged commit 5e1bbec into KevinMusgrave:dev on Jun 18, 2023
Successfully merging this pull request may close these issues:

Manifold similarity loss
P2SGrad: Refined Gradients for Optimizing Deep Face Models