I have a request handler that needs to return a ~100 MB file. I want to:
1) do this efficiently, and
2) limit concurrent downloads by IP, i.e., reject a request if the number of concurrent downloads from that IP is already x (a rough sketch of what I mean is below).
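
To be concrete about (2), the bookkeeping I have in mind is roughly the following. The class name, map, and limit value are just placeholders I made up; the hard part is knowing when it's safe to call release():

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class DownloadLimiter {

    private static final int MAX_PER_IP = 3; // "x" in my description; made-up value

    private final ConcurrentHashMap<String, AtomicInteger> active = new ConcurrentHashMap<>();

    // Returns true if a new download from this IP may start.
    public boolean tryAcquire(String ip) {
        AtomicInteger count = active.computeIfAbsent(ip, k -> new AtomicInteger());
        if (count.incrementAndGet() > MAX_PER_IP) {
            count.decrementAndGet();
            return false; // already at the limit, reject this request
        }
        return true;
    }

    // Must be called once the download has actually finished -- which is exactly
    // the event I can't currently detect.
    public void release(String ip) {
        AtomicInteger count = active.get(ip);
        if (count != null) {
            count.decrementAndGet();
        }
    }
}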

I dispatch to a worker thread, then read from the file using FileInputStream and write to exchange.getOutputStream(). However, I noticed that this writes all the bytes immediately. Digging into why, I found that exchange.getOutputStream() returns an UndertowOutputStream, which appears to buffer all the data given to it. Not only does copying 100 MB for every request seem inefficient, it also means I don't know when the transfer completes, so I can't track concurrent downloads.
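
For context, my current handler looks roughly like this (simplified; the class name and file path are placeholders):

import io.undertow.server.HttpHandler;
import io.undertow.server.HttpServerExchange;
import io.undertow.util.Headers;

import java.io.File;
import java.io.FileInputStream;
import java.io.OutputStream;

public class DownloadHandler implements HttpHandler {

    private static final File FILE = new File("/path/to/the/100mb/file"); // placeholder path

    @Override
    public void handleRequest(HttpServerExchange exchange) throws Exception {
        if (exchange.isInIoThread()) {
            // hop off the IO thread so the blocking reads/writes below run on a worker
            exchange.dispatch(this);
            return;
        }

        exchange.startBlocking();
        exchange.getResponseHeaders().put(Headers.CONTENT_LENGTH, FILE.length());

        OutputStream out = exchange.getOutputStream();
        try (FileInputStream in = new FileInputStream(FILE)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                // I expected these writes to block under back-pressure,
                // but the whole loop finishes immediately
                out.write(buf, 0, n);
            }
        }
    }
}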

Next I tried getResponseChannel(), e.g.:
StreamSinkChannel output = exchange.getResponseChannel();
try (var input = new FileInputStream(file)) {
    Channels.transferBlocking(output, input.getChannel(), 0, file.length());
}

Digging through the code, this path doesn't seem to buffer, yet it still completes immediately. For example, I download the file at 10 bytes/sec:
curl --limit-rate 10 --output file https://example.com/my/file
How can transferBlocking finish right away when this slow transfer will take ages? What is buffering the 100MB?

How can my worker thread write response data one buffer at a time, blocking as needed, until the client has received all the data? That would let me know when the transfer is complete, so I could track concurrent downloads by IP.
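
Putting the two together, this is the shape of handler I'm trying to end up with. The DownloadLimiter is the sketch from above, the 429 status and the names are just my choices, and the open question is what to use in place of the write loop so that reaching the finally block really means the client has all the bytes:

import io.undertow.server.HttpHandler;
import io.undertow.server.HttpServerExchange;
import io.undertow.util.Headers;

import java.io.File;
import java.io.FileInputStream;
import java.io.OutputStream;

public class LimitedDownloadHandler implements HttpHandler {

    private static final File FILE = new File("/path/to/the/100mb/file"); // placeholder path

    private final DownloadLimiter limiter = new DownloadLimiter();

    @Override
    public void handleRequest(HttpServerExchange exchange) throws Exception {
        if (exchange.isInIoThread()) {
            exchange.dispatch(this);
            return;
        }

        String ip = exchange.getSourceAddress().getAddress().getHostAddress();
        if (!limiter.tryAcquire(ip)) {
            exchange.setStatusCode(429); // too many concurrent downloads from this IP
            exchange.endExchange();
            return;
        }

        try {
            exchange.startBlocking();
            exchange.getResponseHeaders().put(Headers.CONTENT_LENGTH, FILE.length());
            OutputStream out = exchange.getOutputStream();
            try (FileInputStream in = new FileInputStream(FILE)) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    // this is the part in question: I need this write (or whatever
                    // replaces it) to block until the client has consumed the data
                    out.write(buf, 0, n);
                }
            }
        } finally {
            // only a valid place to decrement if reaching here really means
            // the client has received every byte
            limiter.release(ip);
        }
    }
}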

Thanks for your help!
-Nate