I strongly suspect you'll need to give the runtime a hint about how many threads it should attempt to use. The platform does point out (§8.2.6.1) that it might actually run the work on fewer threads because of various practical limitations, but the number you request is not capped by, for example, the number of physical CPU cores available. Remember that our task is heavily bound by IO latency, so it makes sense to suggest to the container that we would benefit from a higher degree of parallelism than the number of physical cores. What that number should be for "optimal" results is something we can't guess, so it needs to be a tunable that end users can control. Setting it to "infinite" or some other very high figure would not help performance at all: the reason to constrain such pipelines is to make optimal use of all resources.
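
To make the "tunable" idea concrete, here is a rough sketch using plain `java.util.concurrent` rather than the container's own mechanism. Everything named here is an assumption for illustration: the `pipeline.threads` system property, the 4x-cores fallback and the `MAX_THREADS` cap are hypothetical and not recommendations. It only shows a user-controlled thread count that may exceed the core count for IO-bound work while still being bounded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class TunableParallelism {

    // Hypothetical upper bound: guards against "infinite"-style settings.
    static final int MAX_THREADS = 256;

    /**
     * Resolves the thread-count hint from a user-supplied value.
     * Falls back to a multiple of the core count when the value is
     * missing, not a number, or smaller than 1.
     */
    static int resolveThreads(String userValue, int cores) {
        // Arbitrary starting point for IO-bound work, not a measured optimum.
        int fallback = Math.min(MAX_THREADS, cores * 4);
        if (userValue == null) {
            return fallback;
        }
        try {
            int requested = Integer.parseInt(userValue.trim());
            if (requested < 1) {
                return fallback;
            }
            // The request may exceed the core count, but never the cap.
            return Math.min(requested, MAX_THREADS);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    /** Runs simulated IO-bound tasks on a bounded pool; returns how many completed. */
    static int runSimulatedIo(int threads, int tasks, long latencyMillis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                results.add(pool.submit(() -> {
                    Thread.sleep(latencyMillis); // stands in for a blocking IO call
                    return 1;
                }));
            }
            int done = 0;
            for (Future<Integer> f : results) {
                done += f.get();
            }
            return done;
        } finally {
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        // "pipeline.threads" is a made-up property name for this sketch.
        int threads = resolveThreads(System.getProperty("pipeline.threads"), cores);
        long start = System.nanoTime();
        int done = runSimulatedIo(threads, 64, 50);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(done + " tasks on " + threads + " threads in " + elapsedMs + " ms");
    }
}
```

Running it with something like `java -Dpipeline.threads=2 TunableParallelism` and then with a larger value should show the wall-clock time dropping as threads are added, since the tasks spend their time sleeping rather than computing. Real IO will hit other limits (connections, memory, the backend itself) well before the cap, which is why the value is best left to whoever can measure it.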