Early Natural Language Understanding systems parsed natural language into explicit semantic representations using hand-crafted rules. Such systems are generally limited to the situations imagined by their designers, and much of the development work consists of writing more rules; the result is brittle, and progress is slow, confining these systems to narrow domains. Statistical approaches offer a more forgiving path by learning implicit trade-offs, generalizations, and robust behaviors from data. Deep Learning systems go further, avoiding hand-crafted explicit representations altogether by learning to map to and from natural language via implicit internal vector representations. In this talk I will present our efforts in applying Deep Learning to conversation modeling. We develop an end-to-end system that models written conversations. This task is hard because the system must not only learn language but also learn to do something useful with it. Moreover, the learning process requires huge amounts of data and many helpful users who guide development through live interactions. We currently focus on the subproblem of response suggestion in conversations. We frame natural language response suggestion as a search problem, avoiding the need for any natural language generation step. Inputs are matched with potential responses through a learned semantic metric space. Adding deep layers and delaying the combination of input and responses encourages the network to derive implicit semantic representations of both. We precompute a minimal hierarchy of deep feed-forward networks for all potential responses, and at runtime propagate only the input through the hierarchical network. An efficient nearest-neighbor search over the hierarchical embeddings of the responses then yields the best suggestions. We study our modeling effort in the context of natural language response suggestion in two domains: Inbox and Reddit.
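To make the retrieval framing concrete, the sketch below shows the basic dual-tower pattern the abstract describes: two feed-forward encoders that combine input and response only at a final dot product, with all candidate response embeddings precomputed offline. This is a minimal illustration, not the actual system; the towers here are random and untrained, names such as `encode` and `response_matrix` are hypothetical, and for clarity it scores candidates by brute force rather than by the hierarchical nearest-neighbor search used in practice.

```python
import numpy as np

# Hypothetical dual-encoder retrieval sketch. Two feed-forward towers map
# inputs and candidate responses into a shared metric space; combination
# between the two sides is delayed until the final dot product.

rng = np.random.default_rng(0)
DIM_IN, DIM_HID, DIM_EMB = 64, 48, 32

def encode(x, w1, w2):
    """Stand-in feed-forward tower: dense -> ReLU -> dense -> L2-normalize."""
    h = np.maximum(x @ w1, 0.0)                            # hidden layer (ReLU)
    e = h @ w2                                             # embedding layer
    return e / np.linalg.norm(e, axis=-1, keepdims=True)   # unit-length embedding

# Separate (here random, untrained) weights for the input and response towers.
w1_in, w2_in = rng.normal(size=(DIM_IN, DIM_HID)), rng.normal(size=(DIM_HID, DIM_EMB))
w1_rs, w2_rs = rng.normal(size=(DIM_IN, DIM_HID)), rng.normal(size=(DIM_HID, DIM_EMB))

# Offline: embed the fixed set of candidate responses once and store the matrix.
candidate_feats = rng.normal(size=(10_000, DIM_IN))        # placeholder response features
response_matrix = encode(candidate_feats, w1_rs, w2_rs)

def suggest(input_feats, k=3):
    """Online step: embed only the incoming message, then score by dot product."""
    q = encode(input_feats, w1_in, w2_in)
    scores = response_matrix @ q                           # cosine similarity (unit vectors)
    return np.argsort(-scores)[:k]                         # indices of the top-k responses

print(suggest(rng.normal(size=DIM_IN)))
```

Because the response tower's outputs are fixed after training, only the cheap input-side forward pass runs per message; the exhaustive matrix-vector scoring above would, in a deployed setting, be replaced by an approximate nearest-neighbor index over the precomputed response embeddings.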