I think this is going to depend greatly on your environment, and on the experience of people working in different environments.
Just looking at what you wrote, I'm curious whether an initial multicast cache-only job, run well in advance to slowly pre-deploy all of your install files, might be the best way for you to start.
Then create a second push job that will run the files already in the cache.
We personally do not create multiple tasks over and over again. Instead, we link the task to a query. If you had some sort of custom inventory data linked to specific buildings, geographical locations, or even network subnets, you could just update the query each time and rerun the job on all devices that had failed to run the task. It would not retry the successful ones.
We do this so that we also pick up remediation along the way. Sometimes we skip it, in case the remediation devices just keep failing and slowing things down because they are always offline or something.
The other option would be to keep grabbing a set of 500 devices and adding it to your existing task.
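To make the single-task idea a bit more concrete, here is a rough Python sketch of the bookkeeping only. It does not call any real product API: `Task`, `add_batch`, and `pending` are names I made up for illustration, and the batch size of 500 is just the number from this thread.

```python
from dataclasses import dataclass, field
from itertools import islice


@dataclass
class Task:
    """Hypothetical stand-in for one distribution task (not a real API)."""
    # device name -> "pending", "success", or "failed"
    status: dict = field(default_factory=dict)

    def add_batch(self, devices, size=500):
        """Add up to `size` devices that are not already targeted by this task."""
        new = (d for d in devices if d not in self.status)
        batch = list(islice(new, size))
        for d in batch:
            self.status[d] = "pending"
        return batch

    def pending(self):
        """Devices a rerun would touch: everything that has not succeeded."""
        return [d for d, s in self.status.items() if s != "success"]


all_devices = [f"PC{i:04d}" for i in range(1200)]
task = Task()

first = task.add_batch(all_devices)      # first 500 devices
for d in first[:450]:
    task.status[d] = "success"           # pretend these installed fine
for d in first[450:]:
    task.status[d] = "failed"            # pretend these were offline

second = task.add_batch(all_devices)     # next 500, same task
rerun_targets = task.pending()           # 50 failed + 500 new, no successes
```

The point is just that one task keeps growing and a rerun only touches the devices that have not succeeded yet.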
I might just be really tired, but I am not sure why you are creating multiple tasks for your jobs. I don't think multiple tasks give you multiple MDRs per subnet. You still only get one MDR per subnet, as far as I know.
Also, are you using preferred servers at the various locations? Having several preferred servers that devices could pull from would also cut down on bandwidth across the WAN or other networks. It might also help in the cache-only scenario: if a device failed to receive the files and then failed a peer download, it would go back to a preferred server before trying the source or core.
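As a rough illustration of that fallback order, here is a small Python sketch. It only models the order I described (peer, then preferred server, then source/core) and is not how the product is actually implemented. `fetch`, `unavailable`, and `available` are hypothetical names, and the downloads are faked.

```python
def fetch(package, sources):
    """Try each (name, download_fn) pair in order and return the first name that works."""
    for name, download in sources:
        try:
            download(package)
        except OSError:
            continue  # this source failed, fall through to the next one
        return name
    raise OSError(f"no source could provide {package}")


def unavailable(package):
    """Fake download that always fails, e.g. no peer has the file."""
    raise OSError("not reachable")


def available(package):
    """Fake download that always succeeds."""
    return None


order = [
    ("peer", unavailable),
    ("preferred server", available),
    ("source/core", available),
]
used = fetch("installer.msi", order)
```

Here the peer download fails, so the device ends up on the preferred server and never has to cross the WAN to the source or core.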