Hey Matt,
I think you're giving too much credit to the original authors of that test method and their well-meaning intentions ;) IMO the test is simply incorrect; the intention surely was not that a single Order should implicitly be handled as a single-element Set<Order>.
The reason it doesn't fail with the RI is simply that the RI is actually doing what the test seeks to ensure: @Valid isn't applied by validateValue(), i.e. the algorithm never gets to the point where it would use the given value to recursively validate the constraints of the "orders" property. Were validateValue() to be invoked for another property with a local constraint, a ClassCastException would occur when trying to feed the non-matching value to the ConstraintValidator.
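To illustrate the ClassCastException point, here is a minimal sketch in plain Java. It does not use the Bean Validation API; SimpleValidator, NotEmptySetValidator and this validateValue() are hypothetical stand-ins for a ConstraintValidator typed to Set and for an engine that hands it an untyped value.

```java
import java.util.Set;

class MismatchSketch {
    // Hypothetical stand-in for a constraint validator; not the real API.
    interface SimpleValidator<T> {
        boolean isValid(T value);
    }

    // A validator that only knows how to handle Set values.
    static class NotEmptySetValidator implements SimpleValidator<Set<?>> {
        @Override
        public boolean isValid(Set<?> value) {
            return value != null && !value.isEmpty();
        }
    }

    // Hypothetical engine entry point: the value arrives as a plain Object,
    // so the validator is called through its raw type. The compiler-generated
    // bridge method casts the argument to Set and fails for anything else.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean validateValue(SimpleValidator validator, Object value) {
        return validator.isValid(value);
    }

    // Reports either the validation result or the exception that occurred.
    static String outcome(Object value) {
        try {
            return String.valueOf(validateValue(new NotEmptySetValidator(), value));
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }
}
```

Passing a Set goes through the validator as usual, while passing a single non-Set object fails with a ClassCastException only at the moment the validator is actually invoked. That is why a code path which never reaches the validator never notices the mismatch.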
Can you open a BVTCK issue for fixing this test? A Set<Order> containing the Order object should be passed to validateValue().
Regarding the expected behaviour, indeed the spec doesn't mandate anything specific. I don't think it's a big problem, though, because the non-matching value will cause an exception one way or another anyway. Theoretically we could mandate a specific IllegalArgumentException in that case, but that would require an instanceof check up front, and I don't think we'd want to impose this on implementations. I'd rather leave it as is; of course an implementation is free to do such a check, as Apache BVal was doing.
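For reference, such an up-front check could look roughly like the following. This is only a sketch under assumptions: checkValueType() and its parameters are hypothetical names, and neither the spec nor any particular implementation is claimed to do it this way.

```java
class UpFrontCheckSketch {
    // Hypothetical helper: rejects a value whose runtime type does not match
    // the declared (erased) type of the property before any validator runs.
    static void checkValueType(String propertyName, Class<?> declaredType, Object value) {
        // null is let through here; whether it is valid is up to the constraints.
        if (value != null && !declaredType.isInstance(value)) {
            throw new IllegalArgumentException(
                "Value of type " + value.getClass().getName()
                    + " does not match property '" + propertyName
                    + "' of type " + declaredType.getName());
        }
    }
}
```

Note the limits of such a check: it only sees the erased type, so a Set<String> would still pass for a Set<Order> property, and primitive property types would need extra handling because int.class.isInstance() is false for an Integer. Anything more thorough gets costlier, which is part of why I'd leave it to implementations.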
As far as ExecutableValidator is concerned, the spec says to raise an exception if "parameters don't match with each other". This leaves some leeway on the exact checks to implement, and here, too, I'd prefer to keep it that way instead of enforcing potentially costly checks at the spec level.
--Gunnar