Archive Date
[ 22-04-2001 ]
Category
[ Information Technologies ]
Sub-Category
[ Microsoft ]
|
[http://www.zdnet.com/enterprise/stories/main/0,10228,2678024,00.html
Building a Windows 2000 Distributed File System
Brien M. Posey, ZDNet Business & Technology
January 24, 2001 1:54 PM ET
If you've been working with networks for any length of time, you've no doubt had a server's hard disk fill up, which usually means it's time to have users clear out files they no longer use. You can move seldom-used files to offline storage, but in Windows 2000, you have another option when server hard drives begin to fill up: you can build a Distributed File System (DFS) tree.
DFS makes files and directories that are scattered across multiple servers appear as though they exist on the same server. For example, suppose your users need to access files from share points on two different servers. You could make life easier for them by creating a DFS tree that includes the two different share points. Upon your doing so, the users could access the DFS tree as a single share point, and would no longer have to know which server or share point actually hosted the files.
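To make the idea concrete, here is a minimal Python sketch of the kind of name translation a DFS tree performs. It is only an illustration, not how Windows implements DFS: the root name, the link names, and the server names are all made up for the example.

```python
# Hypothetical DFS root and link table. None of these names come from a real network.
DFS_ROOT = r"\\example.com\public"
LINKS = {
    "reports": r"\\server1\reports",    # link name -> share that really hosts the files
    "drawings": r"\\server2\drawings",
}


def resolve(dfs_path: str) -> str:
    """Translate a path under the DFS root into the path on the hosting server."""
    prefix = DFS_ROOT + "\\"
    if not dfs_path.lower().startswith(prefix.lower()):
        raise ValueError(f"{dfs_path!r} is not under {DFS_ROOT!r}")
    remainder = dfs_path[len(prefix):]
    # The first path component names the link; the rest is passed through unchanged.
    link, _, rest = remainder.partition("\\")
    target = LINKS.get(link.lower())
    if target is None:
        raise KeyError(f"no DFS link named {link!r}")
    return target + ("\\" + rest if rest else "")
```

A user only ever types a path under the root; the table decides which server answers. Moving a share to another server then means changing one table entry rather than retraining users.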
There's no question that implementing a DFS tree can make users' lives easier in a large organization. However, in many organizations, making users' lives easier is a low priority. Often, administrators are simply too busy to implement something new unless it has a clear benefit to the network's performance or security. I'm happy to say that implementing a DFS tree can increase your network's security, reliability, and performance. Read on to learn how.
Enhanced security, reliability, and performance
First, let's look at security. As I mentioned earlier, a DFS tree combines individual share points into a tree structure that users can navigate as though the shares existed on a single server. These share points are only combined logically — each share point remains an individual share point on a specific server as well. DFS requires that any share point that you make a part of a DFS tree use NTFS security. Therefore, any security you've assigned to an individual share point through NTFS still applies whether the share is accessed individually or through the DFS tree.
OK, so the existing security still applies, but how does DFS increase security? Remember that from a user's standpoint it appears that all of the files and folders in the DFS tree exist on a single server. This camouflage makes it more difficult for a hacker to track down a file's actual location. Sure, a hacker may still explore the network to find a desired folder or file, but the longer he takes to get the job done, the greater the risk of his being caught. Many hackers get inside information from someone at the company. Since your users won't actually know which server the share point files exist on, a disgruntled employee won't be able to tell someone outside of the organization where a file or folder actually exists.
Now, let's look at reliability and fault tolerance. The Windows 2000 implementation of DFS allows you to make use of replicas. In the case of a DFS tree, a replica is a copy of a folder that exists on a separate server and is designed to stay synchronized with the original. Therefore, it's possible to have two identical folders on different servers.
Once you've established replicas, you can use them when you need to take a server offline. Suppose you've got a folder that users access constantly, and that lives on a server you need to take offline for routine maintenance. Before taking the server down, you could point the DFS tree to the folder's replica instead of to the original folder. Doing so gives users the illusion that the server is still online, because the files contained on it are still available in the same location as always. When the downed server becomes available, the replicas synchronize, and you can then point the DFS tree back to the original folder.
Replicas can also be used to enhance performance. As you probably know, a server's performance suffers if too many people are trying to access a single share point. To compensate, you can make a heavily used folder and its replica available simultaneously, thus implementing a sort of network load balancing. Users still access the folder through the DFS tree, but the DFS tree connects them to the copy of the folder that's closest to them, thus distributing the load across two separate servers and increasing performance. For really busy share points, you can distribute the load across as many as 32 separate servers.
If I've sold you on the benefits of DFS, then it's time to start setting up a DFS tree.
Setting up DFS
There are a couple of things you need to know about DFS before you begin working with it. First of all, the Windows 2000 version of DFS can be implemented in two different ways. You can install it on standalone servers or on servers that are members of an Active Directory domain. Standalone DFS implementations have several limitations, such as not being able to access an Active Directory (obviously) and therefore not being able to work with replicas, though some organizations might implement them to make more storage capacity available. Therefore, I strongly recommend building an Active Directory-based DFS tree. The rest of this article assumes that's your plan.
Before clients can access a DFS tree, they must be running DFS client software. If your clients are running Windows 2000 Professional or Windows NT 4 Workstation (service pack 3 or higher) then you don't need any special software. Although I've been unable to confirm it, I've heard this is also the case with Windows Millennium Edition. Windows 98 Second Edition includes a DFS client that can be used with the Windows NT 4 version of DFS or for a standalone DFS tree. You can download Active Directory-based DFS clients (also called domain clients) for Windows 95 and Windows 98. No other operating systems support clients for DFS.
With that said, let's walk through the setup process. Begin by selecting Programs | Administrative Tools | Distributed File System from the Start menu to load the Microsoft Management Console and the Distributed File System console. Next, select the New DFS Root command from the Action menu. If you were going to work with a previously existing DFS root, you'd use the Display An Existing DFS Root option instead.
Windows launches the New DFS Root Wizard. The first screen after the introduction screen gives you the choice of creating a domain DFS root or a standalone DFS root. Since we're working in an Active Directory environment, select the Create A Domain DFS Root option.
The next screens ask for the name of the host domain to be used for the DFS root, the DNS name for the server that will host the DFS root, and whether you want to use an existing share for the DFS root or create a new share point. You then enter a name for the DFS root, and optionally a comment to describe the DFS root's function. Click Next followed by Finish to complete the wizard.
Congratulations: you've created a DFS root. You can now begin adding share points to it. After all, unless you add additional shares, the DFS root is just another share point.
To add a share point, select the DFS root to which you want to add the share point, then select the New DFS Link command from the Action menu to bring up a Create A New DFS Link dialog box. Fill in the link name, the share point, and a comment if you like. You can also set a client cache referral time, which controls how long clients see the link as valid without checking to see if it really is valid. The longer the setting, the less network traffic is generated. However, long settings can cause service interruptions should you switch the place to which the link points, as you might do if you needed to take the server down for maintenance.
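The trade-off behind the cache referral time can be put into rough numbers. The Python sketch below is a back-of-the-envelope estimate only. It assumes every client is continuously active and asks for a fresh referral the moment its cached one expires; the client count and timeout values are invented for the example.

```python
def referrals_per_hour(num_clients: int, cache_seconds: float) -> float:
    """Upper-bound estimate of referral requests per hour for one link.

    Assumes each client re-requests the referral as soon as its cached
    copy expires, which overstates traffic from idle clients.
    """
    return num_clients * 3600 / cache_seconds


def worst_case_stale_seconds(cache_seconds: float) -> float:
    """Longest time a client may keep using an old target after you repoint the link."""
    return cache_seconds


# Hypothetical comparison: 500 clients, 5-minute versus 30-minute referral time.
short_setting = referrals_per_hour(500, 300)
long_setting = referrals_per_hour(500, 1800)
```

Under these assumptions, the longer setting cuts referral traffic to a sixth, but a client could keep trying the old location for up to half an hour after you switch the link.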
Now that you know how to set up a basic DFS tree, you can begin to expand on your knowledge. For example, you might begin implementing load balancing into your DFS system. If that interests you, let me know, and I may write a future article on load balancing and DFS.
In an earlier article, "Building a Windows 2000 Distributed File System," I talked about how a distributed file system (DFS) can make files and directories that are scattered across multiple servers appear as though they exist on the same server. For example, if your users need to access files from share points on two different servers, you can create one DFS tree that includes the two different share points. Users can then access the DFS tree as a single share point without having to know which server or share point the files actually exist on.
The problem with implementing DFS in a large organization is that you can find an excessive number of users trying to access files through a single share point. The more users accessing a set of files, the slower the access to them becomes. This is where DFS load balancing comes into play.
You can build a DFS tree and copy the tree's files and directories to other servers. By doing so, you can have two or more identical copies of your entire DFS tree. In such an arrangement, the original DFS tree is known as the master and the copies are known as replicas. You can have up to 32 replicas.
How does Windows 2000 determine which replica a client should connect to? The server selection occurs at the client end. During the server selection process, clients use a list stored in the Active Directory to randomly select a replica. Because the server selection is a random process, client connections are somewhat evenly distributed among DFS servers. However, random server selection also has its disadvantages.
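A quick simulation shows why random selection spreads clients "somewhat evenly" but not exactly. This Python sketch is a model of the idea only, not of the Windows client code; the replica names, the client count, and the fixed random seed are assumptions chosen for the example.

```python
import random
from collections import Counter


def pick_replica(replicas, rng):
    """Pick one replica uniformly at random from the referral list."""
    return rng.choice(replicas)


def simulate(num_clients, replicas, seed=0):
    """Count how many simulated clients land on each replica."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return Counter(pick_replica(replicas, rng) for _ in range(num_clients))


# Hypothetical replica list for one DFS link.
replicas = [r"\\server1\data", r"\\server2\data", r"\\server3\data"]
counts = simulate(3000, replicas)

for server, n in sorted(counts.items()):
    print(f"{server}: {n} clients")
```

Each server ends up with roughly a third of the clients, but the counts are rarely identical, and nothing in the selection takes account of how busy a server already is.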
There's no way for a client to see the number of other client sessions attached to a given DFS server. This makes true load balancing impossible. If a client running Windows 2000 Professional attaches to a DFS server that appears to be overloaded, there's a way to select a different server. You can use the DFSUTIL command to flush the client's partition knowledge table (PKT), which forces the client to request a new DFS server referral. Of course, there's always the chance the client could select the same DFS server again. If the client is running a version of Windows other than Windows 2000, the client has to reboot to change DFS servers.
On the client end, the PKT is the key to the load balancing process. Because the process works differently for different types of clients, the remainder of this article assumes that clients are running Windows 2000 Professional.
The PKT functions as a cache, storing information about available links and servers. Any time a client tries to access something through DFS, it checks the cache for the resource's path. If the cache doesn't contain an entry, the client uses the server to search for the desired resource. When the client finds the resource, it adds an entry to the cache.
The content in the PKT cache has a five-minute life span (unless the time to live is modified by an administrator). If a cached object hasn't been used in the last five minutes, it is removed from the cache, and the client is forced to attach to a different server to access a replica of the data.
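The caching behavior described above can be sketched as a small time-to-live cache. This Python class is an illustration of the general technique, not the actual PKT data structure. The class name, method names, and the injectable clock are all hypothetical, and the real client may handle expiry differently.

```python
import time


class ReferralCache:
    """Toy cache that drops entries unused for longer than ttl_seconds."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl = ttl_seconds      # five minutes by default, as in the article
        self.clock = clock          # injectable so the cache can be tested without waiting
        self._entries = {}          # DFS path -> (server, time of last use)

    def store(self, path, server):
        """Remember which server answered for this path."""
        self._entries[path] = (server, self.clock())

    def lookup(self, path):
        """Return the cached server, or None if absent or idle for too long."""
        entry = self._entries.get(path)
        if entry is None:
            return None
        server, last_used = entry
        now = self.clock()
        if now - last_used > self.ttl:
            del self._entries[path]  # expired: caller must ask for a new referral
            return None
        self._entries[path] = (server, now)  # each use restarts the idle timer
        return server
```

A `None` result corresponds to the client going back to the server for a fresh referral, at which point it may be handed a different replica.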
If an entire server fails, such as during a power failure, it takes clients a few moments to realize the server is no longer available. Keep in mind that the other DFS servers still show the failed server as being a valid replica, and entries pointing to the failed server may still exist in the client's PKT cache. What usually happens in such a situation is that clients attempt to communicate with the failed server, because they don't know it has dropped offline. However, because TCP/IP is designed to retry failed communications several times, it may take a few minutes before clients realize that communications are failing. At that point, a client will check its PKT cache for a new replica. If the cache doesn't contain any entries for other replicas, the client will consult the DFS root to find the name of another replica.
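The failover sequence can be sketched as a loop over the known replicas. This Python function is a simplified model, not the Windows client logic: `try_connect` is a hypothetical callback standing in for whatever actually opens the share, and it is assumed to raise `ConnectionError` once the network retries have run out.

```python
def open_with_failover(replicas, try_connect):
    """Return the first replica that accepts a connection.

    replicas     -- servers to try, in order (for example, from the cached referral)
    try_connect  -- hypothetical callable; raises ConnectionError if the server is down

    Also returns the list of (server, error) pairs for replicas that failed,
    so the caller can log them or drop them from its cache.
    """
    failures = []
    for server in replicas:
        try:
            try_connect(server)
        except ConnectionError as exc:
            failures.append((server, exc))  # remember the dead server and move on
            continue
        return server, failures
    raise ConnectionError(f"all {len(replicas)} replicas failed")
```

In this model the delay users notice comes entirely from how long `try_connect` takes to give up on a dead server, which matches the few minutes of retries described above.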
When a server fails, you can save yourself a lot of trouble by removing it from the list of replicas on the DFS root. To do so, open the Distributed File System console from the Administrative Tools menu, right-click on the DFS root or on the DFS link, and select the Replication Policy command from the resulting context menu to invoke the Replication Policy dialog box. Select the failed server from the list and click the Disable button followed by the OK button.
Information about enabled DFS replicas is stored in the Active Directory, so it may take some time for changes you make to propagate to other domain controllers.
In the event of a partial server failure, such as a faulty hard disk, the server hosting the shared resource can report back to the client, telling it the resource is unavailable. The client can then look for a functional replica. Crash recovery is almost instantaneous unless a client has a file open during the crash. In this situation, the DFS failover process works the same way, but it's up to each application to detect the server change and establish new file locks.
Not only is a replica handy in a crash situation, but it also makes life easier when you have to perform server maintenance. For example, installing a new service pack often requires removing users from the server or performing the installation at night or over the weekend (assuming your company isn't a 24-7 operation). DFS makes this process more convenient. If you've created a replica of your DFS tree, you can simply make the replica unavailable for a period of time while you upgrade the servers. When you're done, you can make the replica available and then make the original unavailable while you tend to its servers. Users might experience slower response times while you're working on the servers, but they'll be able to keep working, and you won't have to lose sleep to perform system maintenance.
Implementing DFS with load balancing can be expensive, especially if you lack the necessary hardware and need to buy more servers to host the replicas. However, distributing file access creates a tremendous performance boost, so administrators should weigh the throughput improvement against the extra hardware expense.
Now that you understand the benefits of DFS replicas, let's turn to the process of creating them.
Replicating the DFS root
Before you begin creating replicas, consider starting with the DFS root server, which is the server through which users access all DFS links. If the DFS root server goes down, users won't be able to access any of the DFS links through the usual method of attaching to the share point; they would have to know the exact server and path where a file resides.
The best way to avoid this problem is to replicate the DFS root. To do so, open the Distributed File System console from the Administrative Tools menu. When the console opens, right-click on the existing DFS root and select New Root Replica to launch the New DFS Root Wizard.
Enter the DNS name of the server that should host the new replica. You can use the Browse function to find the desired server. Once you've selected the server, click Next and you'll see a screen that asks which share point you want to use to store the replica. You can use an existing share point or create a new share. Once you've made your selection, click Finish to complete the wizard.
Although you've created an original and a replica of the DFS tree, Windows won't replicate data between them until you tell it to do so, so your next step is to configure the way that replication occurs. From the DFS console right-click on the DFS root and select Replication Policy. Select the DFS root that you want to make the original and click the Set Master button. When you do, you'll see the replication status for that root change from No to Yes (Primary). Now, enable replication by selecting the replica and clicking the Enable button. Doing so activates automatic replication, which will occur every 15 minutes. It's possible to disable automatic replication, but this only makes sense if you're performing infrequent manual replications.
Replicating shared folders
Now that you've replicated the DFS root, you can turn your attention to other folders. The idea behind replicating shared folders is to replicate the critical data, but not the DFS root. For example, suppose that a folder called DATA existed on server A. You could create a replica of the DATA folder on server B so that servers A and B both have copies of the data.
The process for replicating a share is almost identical to the process of replicating a DFS root. From the DFS console, right-click on the DFS link that points to the share and select the New Replica command from the resulting context menu. In the Add Replica dialog box enter the UNC (Universal Naming Convention) path to a share that you want to contain the replica, then choose automatic or manual replication.
Brien M. Posey is an MCSE and freelance writer. He has been Director of Information Systems for a national chain of health care facilities and a network engineer for the Department of Defense. Because of the extremely high volume of e-mail that Brien receives, it's impossible for him to respond to every message, although he does read them all. ]
|