DOCUMENTATION

dvd::rip - Copyright © Jörn Reder, All Rights Reserved

3. Cluster mode




3.1 Additional installation steps

First make sure that all Perl modules and programs needed for the cluster mode are installed. Refer to the corresponding sections in the installation and dependency check chapters.


3.2 Restrictions

Please note that the cluster mode currently has some restrictions:

  1. (S)VCD isn't supported.
  2. Chapter mode isn't supported.
  3. You can't use PSU core.
  4. The title must be copied to the hard disk; on-the-fly or DVD image transcoding is not possible.


3.3 Architecture overview

A dvd::rip cluster consists of the following components:

  1. A computer with a full dvd::rip and transcode installation, DVD access, and local storage or access to an NFS server where all files are stored.
  2. A computer with a dvd::rip installation, but no GUI access and no transcode installation, on which the cluster control daemon runs. This may be the same computer as noted under 1 (which is usually the case).
  3. An arbitrary number of computers with a full transcode installation; dvd::rip is not necessary here. These are the transcode nodes of the cluster.
  4. The GUI dvd::rip computer and the transcode nodes must all have access to the project directory, shared via NFS or something similar. It doesn't make any difference which computer of the network is the NFS server.
  5. The communication between the cluster control daemon and the transcode nodes is done via ssh. All transcode commands are calculated by the cluster control daemon and executed via ssh on the transcode nodes. dvd::rip assumes that the cluster control computer has user key based ssh access to the nodes, so no password needs to be entered interactively.

This may look confusing, but in fact all the services described here can be distributed in arbitrary ways across your hardware. You can even use the cluster mode with a single computer running all services: dvd::rip GUI, cluster control daemon and transcode node (naturally using local data access). In this case you "misuse" the cluster mode as a convenient job controller, which is in fact a regular use case, because dvd::rip has no dedicated job control features besides this.

A typical two-node installation may look like this:

The first computer runs these services:
  1. dvd::rip GUI
  2. dvd::rip cluster control daemon
  3. transcode node with local storage access
  4. NFS server

The second computer runs this service:
  1. transcode node with NFS access to the project data


3.3.1 Security warning

Currently this cluster architecture has some serious security issues. Once you have set up your cluster it is really easy to use, because there are no password prompts or similar access restrictions. The user key based ssh authentication enables everyone who has access to your cluster control daemon computer to log on to the nodes without having a password at all. This architecture has a small home network in mind, where these drawbacks are not relevant. If you think this is a real problem, you should consider creating special accounts for the dvd::rip cluster access, which are restricted to executing the transcode commands only. Or you should consider not using the cluster mode of dvd::rip at all ;)


3.4 Network configuration


3.4.1 Setup SSH

First you have to set up proper ssh based communication between the cluster control daemon computer and the transcode nodes. There must be no interactive password authentication, because the cluster control daemon must be able to execute commands without user interaction.

Please refer to your ssh documentation for details. This is a brief description of setting up a user key authentication for ssh and OpenSSH.

Login as the user who will run the cluster control daemon (on the corresponding computer) and check if this user has a public key:

  ls -l ~/.ssh/identity.pub

If the file is not present, execute this command:

  ssh-keygen

and follow its instructions, but just press enter if you are asked for a password!

Now add the content of your ~/.ssh/identity.pub file to the ~/.ssh/authorized_keys file on each transcode node. After this you should be able to log in from the cluster control computer to the node without being prompted for a password. If not, try 'ssh -v' to see what's going wrong.
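
For example, assuming a node named node1 and a user account dvdrip on it (both just placeholders for your own setup), copying the key and testing the password-less login might look like this:

  # copy the public key to the node (you'll be asked for the password once)
  cat ~/.ssh/identity.pub | ssh dvdrip@node1 \
      'mkdir -p ~/.ssh; cat >> ~/.ssh/authorized_keys'

  # this should now succeed without any password prompt
  ssh dvdrip@node1 echo ok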


3.4.1.1 Hints for OpenSSH with SSH2 protocol

The steps documented above work with both the commercial ssh and OpenSSH, but only if the SSH1 protocol is accepted by the server. If you use OpenSSH and your server insists on SSH2, follow these instructions:

Generate your key using this command:

  ssh-keygen -t rsa

Again provide an empty password for your key. Now you should have a file ~/.ssh/id_rsa.pub. Add the content of this file to the ~/.ssh/authorized_keys file on each node, and it should work with SSH2 as well.
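
The new key can be copied to the nodes the same way as above, for example (again, node1 and dvdrip are just placeholders):

  cat ~/.ssh/id_rsa.pub | ssh dvdrip@node1 \
      'mkdir -p ~/.ssh; cat >> ~/.ssh/authorized_keys'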


3.4.1.2 OpenSSH can't find transcode binaries

Another common problem with OpenSSH is that the transcode binaries can't be found when they're executed via ssh. Often /usr/local/bin isn't listed in the default PATH for ssh connections, but transcode installs its binaries there by default, so they aren't found.

The solution for this problem is adding /usr/local/bin to the ssh PATH using the ~/.ssh/environment file. Just put this line into ~/.ssh/environment on the node and all binaries should be found:

  PATH=/usr/local/bin:/bin:/usr/bin

or whatever the bin path of your transcode installation is. Don't use any quotes in this line. (Thanks go to Douglas Bollinger for his hint regarding the PATH problem.)
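
To verify that the PATH is really picked up for non-interactive commands, you can run something like the following from the cluster control machine (dvdrip@node1 being a placeholder again). Note that, depending on your OpenSSH version, the server may only read ~/.ssh/environment if PermitUserEnvironment is enabled in its sshd_config.

  # should print something like /usr/local/bin/transcode, not an error
  ssh -x dvdrip@node1 which transcode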


3.4.2 Setup NFS

Note:
The cluster mode currently has one restriction regarding the directory layout of your data directory: you must keep the default values for the vob/avi/temp directories and must not set a base directory outside of your default data base directory, which is configured in the global preferences dialog. Otherwise NFS data access on the nodes will fail badly.

All nodes must have access to the project data base directory, usually using NFS. You must export the project data base directory and mount this directory on each node. Later you'll see how to specify the specific mount point for each node.

This is an example configuration: on my workstation (called wizard) I have a big hard disk mounted on /mega which holds all my dvd::rip projects, so I exported this:

  # cat /etc/exports:
  /mega *(rw)

On my notebook I mount this directory to /hosts/wizard/mega by specifying this entry in /etc/fstab:

  wizard:/mega     /hosts/wizard/mega     nfs
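
To activate the export and test the mount by hand, something like this should work (with wizard and /mega as in the example above):

  # on the NFS server, after editing /etc/exports
  exportfs -ra

  # on the node
  mkdir -p /hosts/wizard/mega
  mount /hosts/wizard/mega
  df -h /hosts/wizard/mega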


3.5 Start cluster control daemon

Now start the cluster control daemon by entering this command:

  dvdrip-master 2

The 2 is the logging level. 2 is Ok for most cases; increase this value if you want more debugging output (which is simply printed to STDERR).

You should get an output similar to this:

  Sun Feb  3 13:55:43 2002 Master daemon activated
  Sun Feb  3 13:55:43 2002 dvd::rip cluster control daemon started
  Sun Feb  3 13:55:43 2002 Started rpc listener on TCP port 28646
  Sun Feb  3 13:55:43 2002 Started log listener on TCP port 28656

You can omit starting the daemon by hand; the dvd::rip GUI will then start a daemon for you in the background, locally on the machine running the GUI, with logging level 2. In fact you only need to start the cluster control daemon by hand if you want to pass a higher debugging level or want to start the daemon on another machine. It's your choice.
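
If you do start the daemon by hand and want a higher logging level while keeping the debugging output in a file, an invocation like this could be used (the level 4 and the log file name are just examples):

  dvdrip-master 4 2>~/dvdrip-master.log &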


3.6 Cluster configuration

Now that SSH and NFS communication are set up properly, dvd::rip itself needs some configuration for the cluster mode.


3.6.1 Global preferences

Start the dvd::rip GUI, open the preferences dialog and select the cluster options page. Here you find three parameters regarding the cluster mode. You can configure whether the cluster control daemon should be started locally on your dvd::rip workstation; otherwise you must specify the hostname of the machine which runs the cluster control daemon. If you configured a local daemon, it will be started on demand if no daemon is found running on the local machine. Don't change the default TCP port number.


3.6.2 Cluster control window

Quit the preferences dialog and press Ctrl+M or select the Cluster/Control menu item. The cluster control window should pop up. If you get an error message, please check whether the data entered in the preferences dialog is correct and whether the cluster control daemon is running.

The window is divided into four parts: the project queue, the job queue, the node list and a log area with logging messages from the cluster control daemon. The lists are empty, because we have neither configured cluster nodes nor pushed projects to the cluster.


3.6.3 Add nodes

Now it's time to add nodes to the cluster. Press the Add Node button and the Edit Cluster Node window will open.

Basically nodes can be divided into two classes: the local node and remote nodes. The local node runs the cluster control daemon, so no SSH communication is necessary to execute commands on it. Usually this node also has the hard disk attached, so no NFS is needed to access the data.

For remote nodes the cluster control daemon uses SSH to execute the commands, and they usually access the data through NFS or something similar.

dvd::rip passes I/O intensive jobs to the node with local disk access, because network access may slow down such jobs significantly.


3.6.3.1 Local node

[Screenshot: node with local data access and cluster control daemon]

The screenshot shows the window for a node configured as the first computer of the two-node example above. That means the node has local hard disk access and the cluster control daemon runs on it, so we don't need to provide an ssh username, because commands can be executed locally without using ssh.


3.6.3.2 Remote transcode node

[Screenshot: remote node for transcoding only]

This screenshot shows the configuration of a typical transcode node with NFS access, the corresponding mount point of the data directory and a username for making ssh connections.

dvd::rip uses ssh -x (extended with user@host and the command) to execute a command on a remote node. The -x option means: don't try to establish X11 forwarding. If this doesn't work for you (e.g. because you have to access the node through a firewall or something similar), you can enter another ssh command with the options of your choice.
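
For example, if a node is only reachable on a non-standard ssh port, entering something like the following as the ssh command could work (the port number is just an example):

  ssh -x -p 2222 -o BatchMode=yes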


3.6.4 Node configuration

[Screenshot: cluster control window with two nodes]

You can specify additional transcode options for a node. These options are added to the internal transcode call, or override corresponding options already computed by dvd::rip. For example, you can specify -u 4,2 to increase performance on a two-processor machine.

If you have multi-processor machines, another option is to configure multiple nodes for them by providing different node names with the same hostname. In most other cases you can leave the hostname entry empty if the node name is already a valid hostname.


3.6.5 Node testing

You can press the Test button at any time to check whether your configuration is correct. A window will pop up which shows a brief result for several test cases, simply printing Ok or Not Ok. A longer description is added for all failed cases, showing the output of the corresponding commands on the node compared with the output of the command executed on the local machine. This way you can easily detect a wrong NFS mount point configuration, SSH problems or different transcode versions on the local machine and the node.


3.7 Work with the cluster

Ok, now we have a proper cluster setup; what's missing is a project for the cluster to work on.


3.7.1 Adding projects

First, rip and configure your project as usual. But at the point where you would usually press the Transcode button, press Add to Cluster instead. The cluster control window will be opened if it's not already open. Also the Cluster Project Edit window will appear, where you can adjust some properties.


3.7.2 Project properties

[Screenshot: cluster project properties]

First you can adjust the number of frames per chunk. The default is 10000, which will be Ok for most cases. But if the performance of your cluster nodes differs a lot, you can decrease this value to prevent slow nodes from blocking the whole cluster by transcoding a huge chunk while the others are idle. But beware: decreasing the chunk size too much makes 2-pass encoding useless, because the material for analysis becomes too short. You have to play with it to get good results in your environment.

Then you can select whether the resulting AVI file should be split after processing, and whether temporary AVI files should be removed when they're not needed anymore. You should enable that, because in cluster mode the project needs up to 3 times more space than a normal project. Also the VOB files can be removed after transcoding if disk space is a problem.


3.7.3 Schedule the project

Your project (with the currently selected title) was added to the project queue. The initial state of the project is not scheduled. You can push as many projects as you want to the cluster this way, and set the priority by moving the project up and down in the queue (using the appropriate buttons).

Now simply press Schedule project for each project you want the cluster to work on. The state of the first project will switch to running. Also the state of all idle nodes will switch to running, and as many jobs as possible will be executed in parallel.


3.7.4 Jobs

[Diagram: cluster job workflow]

The job queue shows all tasks which must be completed. Mainly the work is divided into six phases:


3.7.4.1 Transcode video

As many nodes as possible will be used in parallel for this phase. They will transcode different chunks of the video from MPEG to AVI, but without audio.


3.7.4.2 Transcode audio

For technical reasons the audio has to be transcoded independently of the video, and it's not possible to break this job up into chunks which can be processed in parallel. If you selected more than one audio track, a corresponding number of audio transcoding jobs will appear.


3.7.4.3 Merge video + audio

The transcoded audio file of the first selected audio track and all video chunks are merged and multiplexed into one file. This is preferably done on the node with local hard disk access.


3.7.4.4 Merge additional audio

Additional audio tracks are merged into the result of the Merge video + audio phase.


3.7.4.5 Merge PSU's

If the movie consists of more than one program stream unit, the steps above are repeated for each unit. The corresponding files are then merged together.


3.7.4.6 Split

If you decided to split the AVI afterwards, this is the final phase.


3.7.5 Node statuses

As noted above, the cluster control daemon regularly checks the status of the transcode nodes. If a node goes offline, the corresponding job will be cancelled automatically and later picked up by another idle node. You can stop and start nodes by hand, using the corresponding buttons. The job the node is working on will be cancelled. A stopped node doesn't get jobs, even when it is online. This way you can take a node out of the cluster dynamically when you want to use it for other things.

You can remove a project only if it's not active or if it's finished. You must first stop all nodes working on the project, then you can remove it.


3.8 Some notes about internals

The dvd::rip cluster control daemon stores its state independently of your dvd::rip GUI workstation. That means that once you've added a project to the cluster, changes made to the project with dvd::rip will not affect the cluster operation.

The cluster control daemon stores its data in the ~/.dvdrip-master directory of the user who runs it. The node configuration and all projects are stored there.

These data are manipulated via the Cluster Control window of dvd::rip. The dvd::rip workstation does not store any cluster information locally. The communication between the dvd::rip GUI and the daemon uses a TCP based protocol, which enables dvd::rip to create objects and call their methods transparently over the network.

Additionally the cluster control daemon listens on TCP port 28656 and echoes all log messages to connected clients. So you can simply telnet to the daemon on this port to see what's going on (besides opening the dvd::rip Cluster Control window, which does exactly the same ;)
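
For example, if the daemon runs on a host called wizard (as in the NFS example above), watching the log could be as simple as:

  telnet wizard 28656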

