The ability to perceptually separate acoustic sources and to focus attention on a single source at a time is essential for the use of acoustic information. In this study, a physiologically inspired model of human auditory processing [M. L. Jepsen and T. Dau, J. Acoust. Soc. Am. 124, 422-438 (2008)] was used as the front end of a model for auditory stream segregation. A temporal coherence analysis [M. Elhilali, C. Ling, C. Micheyl, A. J. Oxenham, and S. Shamma, Neuron 61, 317-329 (2009)] was applied to the output of the preprocessing, using the coherence across tonotopic channels to group activity across frequency. With this approach, the model quantitatively accounts for classical streaming phenomena that depend on frequency separation and tone presentation rate, such as the temporal coherence boundary and the fission boundary [L. P. A. S. van Noorden, doctoral dissertation, Institute for Perception Research, Eindhoven, The Netherlands (1975)]. The same model also accounts for the perceptual grouping of spectrally distant components when they are presented synchronously. The most essential components of the front-end and back-end processing in the presented modeling framework are analysed, and future perspectives are discussed.
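
The grouping principle described above can be illustrated with a minimal sketch (not the authors' implementation): two tonotopic channels are represented by idealized rectangular tone envelopes, and their zero-lag correlation serves as a stand-in for the temporal coherence measure. Synchronous tones yield high coherence (grouping into one stream), whereas alternating tones yield low or negative coherence (segregation into two streams). The sampling rate, tone duration, and presentation rate are illustrative assumptions.

```python
import numpy as np

fs = 1000                       # envelope sampling rate in Hz (assumed)
t = np.arange(0.0, 2.0, 1.0 / fs)

def tone_envelope(onsets, dur=0.1):
    """Rectangular envelope: 1 during each tone burst, 0 elsewhere."""
    env = np.zeros_like(t)
    for on in onsets:
        env[(t >= on) & (t < on + dur)] = 1.0
    return env

# Two tonotopic channels, A and B, with tones at 4 per second.
onsets = np.arange(0.0, 2.0, 0.25)
env_A = tone_envelope(onsets)                # channel A bursts
env_B_sync = tone_envelope(onsets)           # B synchronous with A
env_B_async = tone_envelope(onsets + 0.125)  # B alternating with A

def coherence(x, y):
    """Zero-lag correlation between two channel envelopes,
    a simple proxy for temporal coherence across channels."""
    return np.corrcoef(x, y)[0, 1]

# Synchronous presentation: envelopes covary -> channels group.
print(coherence(env_A, env_B_sync))   # ~1.0
# Alternating presentation: envelopes anti-covary -> channels segregate.
print(coherence(env_A, env_B_async))  # negative
```

In a fuller model the correlation would be computed over a sliding window at the output of the peripheral front end, so that coherence can build up or decay over time, but the pairwise-correlation core shown here is the same.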