Contrastive Learning Of Visual Representations From Unlabeled Videos