Peer-to-Peer VDI? Why Not?

This is really a blog post that is one long question…

Why are there no peer-to-peer VDI solutions? (Or are there, and I just haven’t found them?)

There are some big issues with large-scale Virtual Desktop Infrastructure. The biggest one I know of is the so-called “boot storm”, where everyone requests their images at once, stressing both the link from the desktops to the server and the server’s I/O channels.

There are a few workarounds that people use, such as staggering boot times or throwing more spindles/SSDs at the problem, but I’ve never heard of a solution where hosts in the same classroom disseminate the image to their peers.

Security wouldn’t be hard to verify – each image could be checked against a centrally-distributed hash. I/O at the central server wouldn’t suffer NEARLY as much, since it would only need to fulfill the request once per logical grouping of clients, which could then seed one another, much like a torrent swarm.
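
To make the idea concrete, here is a minimal sketch of the verification step, assuming the management server publishes a per-chunk SHA-256 manifest for the golden image (the chunk size and function names are my own invention, not any existing product’s API):

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB per chunk; an arbitrary choice


def build_manifest(image_bytes: bytes) -> list[str]:
    """Run centrally: hash each chunk of the golden image."""
    return [
        hashlib.sha256(image_bytes[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(image_bytes), CHUNK_SIZE)
    ]


def verify_chunk(chunk: bytes, index: int, manifest: list[str]) -> bool:
    """Run on each client: accept a peer-supplied chunk only if it
    matches the centrally-distributed hash."""
    return hashlib.sha256(chunk).hexdigest() == manifest[index]


# A chunk tampered with by a malicious peer fails the check.
image = b"golden image contents" * 1000
manifest = build_manifest(image)
assert verify_chunk(image[:CHUNK_SIZE], 0, manifest)
assert not verify_chunk(b"tampered!" + image[9:CHUNK_SIZE], 0, manifest)
```

Clients would still fetch the small manifest from the central server, but the bulk data could come from any peer, since a bad chunk is simply re-requested elsewhere.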

Am I missing something big, or is there just nothing like this? If it’s a good idea and no one has implemented it, feel free to give me full credit ;-)

  • I understand the server-side disk I/O contention and load issues under VDI “boot storm” conditions, but not having deployed VDI myself, I’m not familiar with what kind of stress is involved between the client and server during boot/login events. Surely a torrent-like approach to distributing data would only be useful if each client actually held a copy of some data another client needed during boot/login, which, under my basic understanding of VDI, is not the case. Please correct me if and where I am wrong.

  • Ryan: I don’t run it either.

    My understanding is that there is a large amount of I/O contention when large numbers of clients boot in a short period of time, which is common in the mornings, or I suppose at a class change, in some cases.

    It seemed like it might be useful to offload as many of the initial disk requests as possible. If instances that had already booted (and thus hold a local copy of the data) could send it to their peers, rather than relying on the overworked centralized disk array, I would think things would go faster.

    Of course, my basic understanding of VDI may be flawed as well.

  • Matt: My understanding is that the I/O contention occurs on the server/SAN side, where many new VMs have to be spun up simultaneously during morning login times. The actual thin client OS has already booted, most likely from tiny local storage holding an image just big enough to run the VMware View client, PCoIP client, or whatnot.

    With the LTSP thin clients I administer, they definitely boot over the network, but I was under the impression that Wyse and other name-brand thin clients often boot proprietary operating systems from local media. However, that does bring up a good point: how many LTSP thin clients could I boot over the network simultaneously? I suppose that depends on the server-side pipe, the disks, and whether the client root is NFS or NBD. I wonder if BitTorrent PXE booting would prove useful at a large enough scale. Hmmmm!

  • I’ve never used virtualization at the desktop layer, but my guess at why there’s no such solution is security. If I have a locally cached copy of a Linux VDI image, what’s to say I couldn’t modify /etc/shadow within my local copy and then wait for it to distribute?

  • A ‘boot storm’ is just another way of saying ‘booting lots of VMs at once’ (although it doesn’t always mean ‘booting’; lots of people loading roaming profiles would have a similar effect).
    There should be very little communication between the client and the server other than the chosen remote-control protocol (ICA, RDP, etc.), which is negligible for most infrastructures.
    The highest contention is usually seen between the hypervisor and the storage, which is why throwing cache modules/solid-state drives into the mix helps things along.
    The only time I can see a peer-to-peer setup helping is if you were running client-side hypervisors or ‘streaming’ applications to the clients; other than that, there should be very little traffic out at the edge.
    I believe VMware is moving towards ‘distributed storage’; it’ll be interesting to see how this works out for large VDI installations. Lots of small hosts with fast lumps of DAS could see a shift away from a SAN infrastructure…

  • Matt, the crux of peer-to-peer is distribution: decentralizing data and computing, harnessing the power of low-cost and underutilized computers, and moving data to the edge for faster delivery/performance.

    VDI is all about centralization (of data, OS, apps, and processing), and the business drivers are IT security, control, and management. While there are pros to VDI, it’s extremely costly to deploy and maintain and doesn’t deliver (on the whole) a good end-user experience. Boot storms are one cause of that.

    The “something big” you are missing is the opposite approach to VDI: Intelligent Desktop Virtualization (IDV). Like peer-to-peer, it involves distributed execution, using a Type-1 hypervisor installed on the PC hardware, beneath the OS. Everything stays local except the lightweight golden desktop image, which is centrally managed. No boot storms. No network strain. No latency issues.

  • KC Marshall

    How about LANTorrent from “Nimbus”?

    LANTorrent is a file distribution protocol integrated into the Nimbus IaaS toolkit. It works as a means to multicast virtual machine images to many backend nodes. The protocol is optimized for propagating virtual machine images (typically large files) from a central repository across a LAN to many virtual machine monitor nodes.
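
    As I understand it, the underlying pattern is store-and-forward: each node writes the stream to local disk while relaying it to the next node in a chain, so the central repository sends the image only once. A toy in-memory sketch of that pipeline (the function and node names are hypothetical, not LANTorrent’s actual API):

    ```python
    def pipeline_distribute(image: bytes, nodes: list[str],
                            chunk_size: int = 1024) -> dict[str, bytes]:
        """Toy store-and-forward pipeline: each node stores every chunk
        locally and relays it onward, so the central repository only
        has to send the image once regardless of node count."""
        stores: dict[str, bytes] = {n: b"" for n in nodes}
        for i in range(0, len(image), chunk_size):
            chunk = image[i:i + chunk_size]
            for node in nodes:         # the chunk flows down the chain
                stores[node] += chunk  # write locally, then forward
        return stores

    copies = pipeline_distribute(b"vm-image-data" * 500,
                                 ["vmm1", "vmm2", "vmm3"])
    assert all(c == b"vm-image-data" * 500 for c in copies.values())
    ```

    The real protocol streams over TCP sockets rather than in memory, but the point is the same: every hop adds a copy without adding load on the source.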

  • Flemming Jacobsen

    Multicasting might be the answer? Provided all the VDI images are identical?