Hi, thank you for sharing this wonderful work!
I ran into a problem when loading the pre-trained file 'vg-faster-rcnn.tar'.
The anchor ratios and anchor scales in neural-motifs are inconsistent with the ones used in torchvision.models.detection:
motifs
anchor ratios: (0.23232838, 0.63365731, 1.28478321, 3.15089189); scales: (2.22152954, 4.12315647, 7.21692515, 12.60263013, 22.7102731)
torchvision
anchor ratios: (0.5, 1.0, 2.0); scales: (32, 64, 128, 256, 512).
As a result, the pre-trained weights in 'vg-faster-rcnn.tar' do not match torchvision's rpn.head.bbox_pred:
(120, 512, 1, 1) vs. (60, 512, 1, 1).
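To sanity-check these shapes, here is a minimal sketch in plain Python. The helper names are hypothetical and come from neither library. It assumes one anchor per (ratio, scale) pair at each location and 4 box deltas per anchor in a 1x1 conv. Under those assumptions the torchvision settings give 60 output channels, which matches. The motifs settings would give 80 rather than 120, though. 120 would be consistent with 20 anchors times 6 values per anchor (for example 2 scores plus 4 deltas in a single conv), but that is only a guess about how the checkpoint is laid out.

```python
def anchors_per_location(ratios, scales):
    """Assumption: every (ratio, scale) pair yields one anchor per location."""
    return len(ratios) * len(scales)


def conv_weight_shape(num_anchors, outputs_per_anchor, in_channels=512):
    """Weight shape of a 1x1 conv emitting `outputs_per_anchor` values per anchor."""
    return (num_anchors * outputs_per_anchor, in_channels, 1, 1)


# Values quoted in this issue.
motifs_ratios = (0.23232838, 0.63365731, 1.28478321, 3.15089189)
motifs_scales = (2.22152954, 4.12315647, 7.21692515, 12.60263013, 22.7102731)
tv_ratios = (0.5, 1.0, 2.0)
tv_scales = (32, 64, 128, 256, 512)

motifs_anchors = anchors_per_location(motifs_ratios, motifs_scales)  # 4 * 5
tv_anchors = anchors_per_location(tv_ratios, tv_scales)              # 3 * 5

# 4 box deltas per anchor (the bbox_pred convention assumed here).
tv_bbox = conv_weight_shape(tv_anchors, 4)
motifs_bbox = conv_weight_shape(motifs_anchors, 4)

# Guess: 2 scores + 4 deltas per anchor packed into one conv.
motifs_fused = conv_weight_shape(motifs_anchors, 6)

print("torchvision bbox head:", tv_bbox)
print("motifs, 4 per anchor: ", motifs_bbox)
print("motifs, 6 per anchor: ", motifs_fused)
```

If the fused-head guess holds, the checkpoint tensor could not be copied into rpn.head.bbox_pred directly even with matching anchor settings. It would first have to be split into its score and delta parts.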
I'm not sure whether my analysis above is correct, or whether this mismatch will affect the performance of the RPN.