Sampling designs for resource-efficient collection of outcome labels for machine learning, with application to electronic medical records