My initial feeling was that we should support only SEQUENTIAL, and even remove that Executor for the sake of simplicity.
But your proposal probably makes more sense. The difficulty for a user working around HSEARCH-598 is that it's relatively complex to figure out how many connections are going to be needed for each root type; capping the amount of parallel work at a known constant makes this much better. It seems a great idea to me, as there are many intermediate options between SEQUENTIAL and PARALLEL.
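To make the "known constant" idea concrete, here is a minimal sketch using plain java.util.concurrent. It is not the actual Hibernate Search code: the class name CappedRootTypeIndexing, the method runCapped and the sleep standing in for the indexing work are all hypothetical. It only shows that a fixed-size pool bounds how many root types are processed at once, whatever the number of root types.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not part of Hibernate Search.
class CappedRootTypeIndexing {

    /**
     * Submits one task per root type to a pool of fixed size and returns
     * the highest number of tasks observed running at the same time.
     */
    static int runCapped(List<String> rootTypes, int threadPoolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(threadPoolSize);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (String rootType : rootTypes) {
            pool.submit(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    // Stand-in for the real work of indexing one root type.
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("indexing tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return peak.get();
    }
}
```

With threadPoolSize=1 this degenerates to the SEQUENTIAL behaviour, and with a size equal to the number of root types it behaves like PARALLEL, so the pool size covers the intermediate options in between.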
My only doubt is whether there is a practical benefit in running any of them in parallel. FWIW, when I designed this I was a power user and had a complex application needing it, but I remember that running it in parallel gave only a very small benefit. I wonder now if it wouldn't be equally (or more) efficient to run only the optimisation steps in parallel with next-block indexing. I don't have access to that system to run benchmarks anymore; would you be able to verify whether threadpoolSize=2 is better than =1 in your case? I doubt it, but it's hard to predict.