Downloads by nodes

analysis
Dec 6, 2002 | 5 mins

Kontiki's grid could solve the crisis of massive Internet file transfers

WHEN THE INTERNET bogs down on long weekends, it’s not from instant messaging, online games, or spam. The traffic that makes circuits and servers buckle is large file transfers. Yes, some of this traffic is contraband. But ripped music CDs, MPEG-encoded episodes of Friends, and pirated software account for only a fraction of the traffic. Most of these transfers are legit. Content producers want consumers to download movie trailers, news footage, and songs, and software vendors want customers to download multi-megabyte demos and patches. For example, every new release of Red Hat Linux or FreeBSD triggers a worldwide download frenzy of CD-ROM image packages totaling about 2GB each.

Furthermore, companies and academic institutions increasingly are turning to high-quality digital media for customer support, sales training, and distance learning. The challenge is figuring out how to get content to end points in a network-friendly way.

Reviled as they were, Napster and Gnutella had the seeds of a good idea: Instead of pooling all the content in one place or spreading it across several mirror sites, let the clients that have already pulled down the content serve it to the next wave of clients. Unfortunately, the Napster and Gnutella engineers focused on getting network users interconnected, and left most of the real problems (the ones business cares about) unsolved.

Consequently, users of these networks still suffer slow transfers and interrupted downloads. Security is haphazard or nonexistent, and the decentralized nature of peer networks makes monitoring impossible. The completely decentralized model has proven unsuitable for commercial applications.

Nodes know nothing

In a peer network, each node manages its own connections and transfers. The trouble with this egalitarian approach is that not all nodes are truly peers. Slow machines on dial-up connections are not the equal of multiprocessor workstations on digital circuits. The network must accommodate the limitations of the least capable nodes while exploiting faster, better-connected nodes to the extent their users permit. It isn't practical for every node to gather and analyze performance and bandwidth availability for every other node on the network, and without that knowledge, a peer can't choose the fastest or most reliable connections.

In designing Version 2.0 of its DMS (Delivery Management System), Kontiki combines the best aspects of grid and client/server topologies. The result is a distribution network that takes the pain out of distributing large files via the Internet and across LAN segments. Everybody gets a break: Users automatically get uninterrupted downloads from the fastest available source; network administrators can throttle traffic and delay distribution to periods of minimal utilization; and nobody has to know anything about grids. As far as users can tell, downloads are simply faster and files always come through on the first try.

Watching the flock

Kontiki DMS 2.0 injects servers into the grid to manage and secure traffic. The servers watch grid nodes and try to arrange the best available connection based on each machine’s CPU and network load. Every file transfer request queries the DMS directory server to identify the node or nodes best equipped to serve the file. That sounds like load balancing, but DMS 2.0 does the job on the fly. Kontiki’s KDM client software will switch to a faster source in mid-download if such a source appears on the grid. If a node that was serving a file falls out of the grid, DMS swaps in another node and allows the transfer to continue uninterrupted. One client can pull from multiple servers to shorten downloads and improve reliability.
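The selection loop described above can be sketched in miniature. This is an illustrative model only, not Kontiki's code: a hypothetical directory ranks live sources by measured throughput, and the client re-consults it for every chunk, so a faster source gets swapped in mid-download and a failed node drops out without interrupting the transfer.

```python
# Illustrative model only; node names, record fields, and the
# throughput-based ranking rule are assumptions, not Kontiki's protocol.

def best_source(directory):
    """Pick the live source the directory ranks fastest, or None."""
    live = [s for s in directory if s["alive"]]
    return max(live, key=lambda s: s["throughput"]) if live else None

def download(chunks, directory):
    """Fetch each chunk from whichever source currently ranks best,
    absorbing failures and faster arrivals between chunks."""
    log = []
    for chunk in chunks:
        src = best_source(directory)
        if src is None:
            raise RuntimeError("no live sources left in the grid")
        log.append((chunk, src["name"]))
    return log
```

Marking a node dead between chunks makes the next chunk come from the runner-up, which mirrors the uninterrupted fail-over the servers arrange.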

The Kontiki architecture is flexible: it does not require that every node be a full participant in the grid, and the administrator guides the allocation of resources. Letting peer nodes ship files to one another is more suited to a segmented LAN or branch office setting than to the Internet. For Internet distribution, leeching (downloading without serving files to others) is acceptable. In that scenario, the business or organization deploys a grid of servers. It resembles a farm or a collection of mirror hosts, except that load balancing and fail-over adjustments are transparent and made in real time. A single ordinary URL points to the file; users don't choose the closest mirror, DMS does. All of the redirection and inter-session management is handled by client nodes using information supplied by the DMS servers.

If a Kontiki client node does elect to serve files to others — that decision can be forced on users by administrators in an intranet setting — each node can throttle its total upstream bandwidth. It can also automatically duck out of the grid whenever the machine is in use. Since most desktop machines are left on during nights and weekends, they might as well throw some files around during their idle time.
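The two serving policies just described, a cap on total upstream bandwidth and ducking out of the grid while the machine is in use, can be sketched as a token bucket plus an idle check. Everything here (the class name, the 300-second threshold) is an assumption made for illustration, not Kontiki's implementation.

```python
import time

class UpstreamThrottle:
    """Token-bucket cap on total upstream bandwidth (illustrative)."""

    def __init__(self, bytes_per_sec):
        self.rate = bytes_per_sec
        self.tokens = bytes_per_sec
        self.last = time.monotonic()

    def allow(self, nbytes):
        """True if nbytes may be sent now without exceeding the cap."""
        now = time.monotonic()
        # Refill tokens for the time elapsed, never beyond one second's worth.
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

def should_serve(idle_seconds, threshold=300):
    """Serve files only once the machine has been idle past the threshold."""
    return idle_seconds >= threshold
```

A serving node would consult both checks before shipping a block: the throttle enforces the administrator's bandwidth ceiling, and the idle check keeps the grid from competing with the user at the keyboard.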

Kontiki ensures that transfers are secured and logged. Users must authenticate their systems before they can join the grid. Access to content can be restricted on a per-file basis, and DMS 2.0 will consult existing LDAP servers for user credentials. Links to content management servers allow companies to inject new content into a DMS grid using familiar and trusted tools.
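As a toy illustration of per-file restriction, consider an access check keyed on group membership. In DMS 2.0 the groups would come from the existing LDAP servers mentioned above; here a plain dictionary stands in for that lookup, and all user, group, and file names are invented.

```python
# Invented data standing in for an LDAP directory and a per-file ACL.
GROUPS = {"alice": {"sales"}, "bob": {"eng"}}
FILE_ACL = {"training.mpg": {"sales"}, "patch.bin": {"sales", "eng"}}

def can_download(user, filename):
    """Allow the transfer only if the user belongs to a permitted group."""
    allowed = FILE_ACL.get(filename, set())
    return bool(GROUPS.get(user, set()) & allowed)
```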

Kontiki DMS 2.0 is an excellent, practical application of grids, and an ingenious solution to the hassle of moving big files around networks of thousands of nodes. At this stage, DMS 2.0 is primarily a hosted solution, although there are plans to turn it into a product for easy deployment behind the firewall. Meanwhile, the company's engineers can deploy DMS 2.0 on-site, as they have done for several high-profile customers.