Vulkan: Add SmallBufferPool for small allocations The small pool uses buddy algorithm that is much faster. The only downside is that it rounds size to power of two. For small allocations that rounding does not generate much waste and avoid fragmentation as well. This CL adds a small pool for host visible non-coherent pool. My testing with gardenscape shows that on top of other CLs, this reduces CPU overhead from 1.77ms to 1.55ms as measured with --minimize-gpu-work with offscreen. Bug: b/215768827 Change-Id: I68434931f238c4e980b77d3df46d762260ef1db5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3415211 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Ian Elliott <ianelliott@google.com> Commit-Queue: Charlie Lao <cclao@google.com>