Abstract. Recent studies on grammatical inference have demonstrated the benefits of “distributional learning” for learning context-free and context-sensitive languages. Distributional learning models and exploits the relation between strings and contexts in the language of the learning target. There are two main approaches. One, which we call primal, constructs nonterminals whose languages are characterized by strings. The other, which we call dual, uses contexts to characterize the language of each nonterminal of the conjectured grammar. This paper demonstrates and discusses the duality of these two approaches, presenting several powerful learning algorithms along the way.