A use-case would be a batch processing server which runs a large number of jobs which perform calculations but then need to fire off expensive external processes or perform heavy network I/O. While for most of the time, the threads can all run in parallel, it might be desired that the expensive operation is not run by more than 10 threads at once, to avoid thrashing swap space or saturating the network. This can be accomplished with a counting semaphore:
SYMBOL: expensive-section requests 10 <semaphore> '[ ... _ [ do-expensive-stuff ] with-semaphore ... ] parallel-map

Here is a concrete example which fetches content from 5 different web sites, making no more than 2 requests at a time:
USING: concurrency.combinators concurrency.semaphores fry http.client kernel urls ; { URL" http://www.apple.com" URL" http://www.google.com" URL" http://www.ibm.com" URL" http://www.hp.com" URL" http://www.oracle.com" } 2 <semaphore> '[ _ [ http-get nip ] with-semaphore ] parallel-map