Mask R-CNN optimizer and learning rate scheduler in PyTorch

In the Mask R-CNN paper, the optimizer for training on the MS COCO 2014/2015 dataset for instance segmentation is described as follows (I believe this is the dataset; correct me if I'm wrong):

We train on 8 GPUs (so effective minibatch size is 16) for 160k iterations, with a learning rate of 0.02 which is decreased by 10 at the 120k iteration. We use a weight decay of 0.0001 and momentum of 0.9. With ResNeXt [45], we train with 1 image per GPU and the same number of iterations, with a starting learning rate of 0.01.

I’m trying to write an optimizer and learning rate scheduler in PyTorch for a similar application, to match this description.

For the optimizer I have:

import torch

def get_Mask_RCNN_Optimizer(model, learning_rate=0.02):
    # SGD with the momentum and weight decay quoted from the paper
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9, weight_decay=0.0001)
    return optimizer

For the learning rate scheduler I have:

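def get_MASK_RCNN_LR_Scheduler(optimizer, step_size):
    # gamma=0.1 multiplies the learning rate by 0.1 every step_size calls to scheduler.step().
    # verbose is deprecated in recent PyTorch releases; use scheduler.get_last_lr() to inspect the rate.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=step_size, gamma=0.1)
    return scheduler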
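Since the quoted schedule drops the rate once at a fixed iteration, one possible alternative to StepLR is MultiStepLR with a single milestone, stepped once per iteration rather than once per epoch. The sketch below is only an illustration under assumptions: the helper name build_optimizer_and_scheduler, the tiny Linear stand-in model, and the shortened decay_iter=3 are hypothetical, and gamma=0.1 assumes that "decreased by 10" means division by 10.

```python
import torch

def build_optimizer_and_scheduler(model, base_lr=0.02, decay_iter=120_000):
    # Hyperparameters follow the quoted paper text; decay_iter is the iteration of the single drop.
    optimizer = torch.optim.SGD(
        model.parameters(), lr=base_lr, momentum=0.9, weight_decay=0.0001
    )
    # gamma=0.1 assumes "decreased by 10" means "divided by 10".
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[decay_iter], gamma=0.1
    )
    return optimizer, scheduler

# Tiny stand-in model and a shortened schedule so the behaviour is easy to inspect.
model = torch.nn.Linear(4, 2)
optimizer, scheduler = build_optimizer_and_scheduler(model, decay_iter=3)

lrs = []
for iteration in range(5):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 4)).sum()
    loss.backward()
    optimizer.step()
    lrs.append(scheduler.get_last_lr()[0])  # rate used for this iteration
    scheduler.step()                        # called once per iteration, not per epoch
```

With decay_iter=120_000 and 160k iterations in total, StepLR with step_size=120_000 would behave the same way, because its second drop at 240k would never be reached.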

When the authors say “decreased by 10”, do they mean divide by 10? Or do they literally mean subtract 10, in which case we would have a negative learning rate, which seems odd/wrong? Any insights appreciated.
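For step schedules like this one, "decreased by 10" is normally read as a factor of 10, which corresponds to gamma=0.1 in PyTorch schedulers. The small pure-Python sketch below compares the two readings for the quoted numbers; the function names lr_divided and lr_subtracted are hypothetical and exist only for this comparison.

```python
def lr_divided(iteration, base_lr=0.02, decay_iter=120_000, factor=10.0):
    """Reading 1: divide the base rate by `factor` from `decay_iter` onward."""
    return base_lr / factor if iteration >= decay_iter else base_lr

def lr_subtracted(iteration, base_lr=0.02, decay_iter=120_000, amount=10.0):
    """Reading 2: subtract `amount` from the base rate from `decay_iter` onward."""
    return base_lr - amount if iteration >= decay_iter else base_lr

for it in (0, 119_999, 120_000, 159_999):
    print(it, lr_divided(it), lr_subtracted(it))
```

Under the first reading the rate goes from 0.02 to roughly 0.002 at iteration 120k; under the second it becomes negative, which would push SGD in the direction that increases the loss, so the literal reading is very unlikely to be what the authors meant.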

Machine Learning – How Many Object Classes Can You Practically Detect With Faster R-CNN?

I'm trying to follow this tutorial, which uses the Faster-RCNN-Inception-V2-COCO model from the TensorFlow Model Zoo to recognize playing cards. I was wondering what the practical limit is on the number of object classes I could use for detection. In particular, I would like to distinguish each letter of the English alphabet (case-sensitive), each digit, and mathematical symbols. Would this model work with so many different classes?

If I want to recognize some words, would it make sense to label all the characters that make up a word, as well as the word itself (provided there are only a few key words that I really want to recognize)?
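To get a feel for the size of the class set being asked about, the sketch below counts the case-sensitive letters and digits and writes them out in the label map text format used by the TensorFlow Object Detection API (one item per class, ids starting at 1). The helper name build_label_map is hypothetical, and mathematical symbols or whole-word classes are left out; they would be appended to the same list under names of your choosing. It says nothing about whether the detector will separate look-alike classes such as "o", "O" and "0".

```python
import string

def build_label_map(class_names):
    """Render class names as label map text; ids start at 1 (0 is reserved for background)."""
    entries = []
    for class_id, name in enumerate(class_names, start=1):
        entries.append(f"item {{\n  id: {class_id}\n  name: '{name}'\n}}\n")
    return "\n".join(entries)

# 26 lowercase + 26 uppercase letters + 10 digits.
character_classes = list(string.ascii_letters + string.digits)
label_map_text = build_label_map(character_classes)

print(len(character_classes))  # 62
```

The resulting class count is also the value that num_classes in the model's pipeline config would need to match.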

Machine Learning – How to get back the coordinate points corresponding to the intensity points obtained from a Faster R-CNN object recognition process?
