Self-training (ST), or pseudo-labeling, has recently sparked significant interest in the automatic speech recognition (ASR) community because of its success in harnessing unlabeled data. Unlike prior semi-supervised learning approaches that relied on iteratively regenerating pseudo-labels (PLs) from a trained model and using them to train a new model, recent state-of-the-art methods perform ‘continuous training’, where PLs are generated using a very recent version of the model being trained. Nevertheless, these approaches still rely on bootstrapping ST with an initial supervised learning phase in which the model is trained on labeled data alone. We believe this risks over-fitting to the labeled dataset in low-resource settings, and that ST from the start of training should reduce such over-fitting. In this paper we show how this can be done for ASR by dynamically controlling the evolution of PLs during training. To the best of our knowledge, this is the first study demonstrating the feasibility of generating PLs from the very start of training. We achieve this using two techniques that avoid instabilities which lead to degenerate models that do not generalize. First, we control the evolution of PLs through a curriculum that uses the online changes in PLs to control the membership of the cache of PLs and improve generalization. Second, we find that by sampling transcriptions from the predictive distribution, rather than only using the best transcription, we can stabilize training further. With these techniques, our ST models match prior works without an external language model.
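The sketch below is a minimal, illustrative rendering of the two stabilization ideas described above, not the paper's actual implementation: a toy PL cache whose membership is refreshed based on how much each utterance's PL has changed online, and a transcription generated by sampling from the frame-level predictive distribution instead of taking only the argmax. All names (`PLCache`, `sample_transcription`, the change threshold, the CTC-style collapse) are assumptions made for illustration.

```python
import torch

BLANK = 0  # assumed CTC blank index (illustrative)


def sample_transcription(log_probs: torch.Tensor) -> list[int]:
    """Draw one transcription by sampling each frame from the predictive
    distribution (rather than greedy argmax), then collapsing repeats and blanks."""
    probs = log_probs.exp()                                # (T, vocab)
    frames = torch.multinomial(probs, 1).squeeze(1).tolist()
    out, prev = [], None
    for tok in frames:                                     # CTC-style collapse
        if tok != BLANK and tok != prev:
            out.append(tok)
        prev = tok
    return out


def edit_distance(a: list[int], b: list[int]) -> int:
    """Levenshtein distance between two token sequences."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[-1]


class PLCache:
    """Toy cache of pseudo-labels whose entries are refreshed only when the
    newly generated PL has changed enough relative to the cached one."""

    def __init__(self, change_threshold: float = 0.2):
        self.change_threshold = change_threshold
        self.store: dict[str, list[int]] = {}

    def update(self, utt_id: str, new_pl: list[int]) -> list[int]:
        old_pl = self.store.get(utt_id)
        if old_pl is None:
            self.store[utt_id] = new_pl
            return new_pl
        # Relative online change of this utterance's PL.
        change = edit_distance(old_pl, new_pl) / max(len(old_pl), 1)
        if change >= self.change_threshold:
            self.store[utt_id] = new_pl   # PL still evolving: refresh the cache entry
        return self.store[utt_id]         # otherwise keep training on the cached PL
```

In this toy version, training would call `sample_transcription` on the current model's outputs for each unlabeled utterance and pass the result through `PLCache.update`, so that PLs which fluctuate strongly are refreshed while stable ones are reused; the actual curriculum and cache policy in the paper may differ.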