It's like putting another I/O-specific cache layer between kernel space and user space. The layer supports many operations, including batched ones, so the number of transitions between kernel mode and user mode drops sharply. Fewer mode switches per request, together with a few other implementation tricks, increase the throughput of I/O operations.
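To make the batching idea concrete, here is a minimal toy model in Python. It is not a real kernel interface: the `BatchedQueue` class, its `queue` and `submit` methods, and the `transitions` counter are all hypothetical names invented for illustration. The sketch only counts how many simulated mode switches are needed when requests are handed over one at a time versus all at once.

```python
from collections import deque


class BatchedQueue:
    """Toy model (hypothetical) of a request queue sitting between user and kernel space."""

    def __init__(self):
        self.pending = deque()  # requests queued on the user side, no transition yet
        self.transitions = 0    # simulated user/kernel mode switches

    def queue(self, op, payload):
        # Queuing is a plain in-memory append; it costs no transition.
        self.pending.append((op, payload))

    def submit(self):
        # Hand over every pending request in a single simulated transition.
        if not self.pending:
            return []
        self.transitions += 1
        results = [(op, len(payload)) for op, payload in self.pending]
        self.pending.clear()
        return results


def unbatched(requests):
    """Submit after every request: one transition per request."""
    q = BatchedQueue()
    for op, payload in requests:
        q.queue(op, payload)
        q.submit()
    return q.transitions


def batched(requests):
    """Queue everything first, then submit once: one transition in total."""
    q = BatchedQueue()
    for op, payload in requests:
        q.queue(op, payload)
    q.submit()
    return q.transitions


requests = [("write", b"x" * 10)] * 8
print(unbatched(requests), batched(requests))  # prints "8 1"
```

In this model, eight requests cost eight transitions when submitted individually and only one when batched. A real implementation has other costs that the toy ignores, so the numbers show the direction of the saving, not its size.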