MATLAB DISTRIBUTED COMPUTING SERVER 4 - SYSTEM ADMINISTRATORS GUIDE User Manual

Browse online or download the user manual for the server MATLAB DISTRIBUTED COMPUTING SERVER 4 - SYSTEM ADMINISTRATORS GUIDE. Distributed Computing Server System Administrator's Guide user guide


Table of Contents

Page 1 - System Administrator’s Guide

MATLAB® Distributed Computing Server™ System Administrator’s Guide R2013b

Page 2 - Natick, MA 01760-2098

1 Introduction. MATLAB Distributed Computing Server Product Description. Perform MATLAB® and Simulink® computations on clusters, clouds, and grids. MATLAB Di

Page 3 - Revision History

3 Product Installation. Step 3: Validate Cluster Profile. In this step you validate your cluster profile, and thereby your installation. 1 If it is not a

Page 4

Configure for a Generic Scheduler. Note: If your validation fails any stage, contact the MathWorks install support team. If your validation passed, you no

Page 5 - Contents

3 Product Installation 3-54

Page 6

4 Admin Center • “Start Admin Center” on page 4-2 • “Set Up Resources” on page 4-3 • “Test Connectivity” on page 4-11 • “Export and Import Sessions” on

Page 7

4 Admin Center. Start Admin Center. Admin Center is a graphical user interface with which you can control and monitor the MATLAB Distributed Computing S

Page 8

Set Up Resources. In this section... “Add Hosts” on page 4-3; “Start mdce Service” on page 4-4; “Start an MJS” on page 4-5; “Start Workers” on

Page 9 - Introduction

4 Admin Center. Start mdce Service. A host must be running the mdce service if an MJS or worker is to run on that host. Normally, you set this up with Adm

Page 10 - 1 Introduction

Set Up Resources. A dialog box leads you through the procedure of starting the mdce service on the selected hosts. There are five steps to the procedure

Page 11 - Product Overview

4 Admin Center. In the New MATLAB Job Scheduler dialog box, provide a name for the MJS, and select a host to run it on. Alternative methods for startin

Page 12

Set Up Resources. Start Workers. To start MATLAB workers, click Start in the Workers module. In the Start Workers dialog box, specify the numbers of worke

Page 13 - Toolbox and Server Components

Product Overview. In this section... “Parallel Computing Concepts” on page 1-3; “Determining Product Installation and Versions” on page 1-

Page 14

4 Admin Center. Alternative methods for starting workers include selecting the pull-down Workers > Start, or right-clicking a listed host or MJS

Page 15

Set Up Resources. To get more information on any host, MJS, or worker listed in Admin Center, right-click its name in the display and select Propertie

Page 16

4 Admin Center. Move a Worker. To move a worker from one host to another, you must completely shut it down, then start a new worker on the desired host: 1

Page 17

Test Connectivity. Admin Center lets you test communications between your MJS node, worker nodes, and the node where Admin Center is r

Page 18

4 Admin Center. When the tests are complete, the Running Tests dialog box automatically closes, and Admin Center displays the test results in the Conne

Page 19 - Network Administration

Test Connectivity. Tests that include failures or other results might look like the following figure. Double-click any of the symbols in the test results

Page 20 - 2 Network Administration

4 Admin Center. Export and Import Sessions. By default, Admin Center saves the cluster definition, process status, and test results, so the next time t

Page 21 - Fully Qualified Domain Names

Prepare for Cluster Profiles. Admin Center does not create cluster profiles, but the information displayed in Admin Center

Page 22

4 Admin Center 4-16

Page 23 - Install and Configure

5 Control Scripts - Alphabetical List

Page 24

1 Introduction. Figure labels: MATLAB Worker, Scheduler, MATLAB Client, Parallel Computing Toolbox, MATLAB Distributed Computing Server, MATLAB Worker, MATLAB Distributed Computing S

Page 25

admincenter. Purpose: Start Admin Center GUI. Syntax: admincenter. Description: admincenter opens the MATLAB Distributed Computing Server Admin Center. When set

Page 26

createSharedSecret. Purpose: Create shared secret for secure communication. Syntax: createSharedSecret; createSharedSecret -file <filename>. Description: c

Page 27

mdce. Purpose: Install, start, stop, or uninstall mdce service. Syntax: mdce install; mdce uninstall; mdce start; mdce stop; mdce console; mdce restart; mdce ... -mdced

Page 28

mdce. mdce stop stops running the mdce service. This automatically stops all job managers and workers on the computer, but leaves their checkpoint infor

Page 29 - <MyJobManager> -v

nodestatus. Purpose: Status of mdce processes running on node. Syntax: nodestatus; nodestatus -flags. Description: nodestatus displays the status of the mdce ser

Page 30

nodestatus. Flag | Operation. -baseport <port_number>: Specifies the base port that the mdce service on the remote host is using. You need to specify thi
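
Before passing a -baseport value to nodestatus, it can help to confirm that the port is reachable from the machine issuing the command. The following is a minimal Python sketch of such a check, not part of the MathWorks tooling; the function name port_is_open is hypothetical, and the host name and port number you pass in are your own assumptions about the cluster.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only shows that something is listening on the port; it does not
    prove that the listener is the mdce service.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False
```

For example, port_is_open("node1", 27350) would probe a host called node1 on port 27350; both values are placeholders to replace with the base port configured in your own mdce_def file.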

Page 31 - Custom Startup Parameters

remotecopy. Purpose: Copy file or folder to or from one or more remote hosts using transport protocol. Syntax: remotecopy <flags><protocol options&g

Page 32

remotecopy. Flags and Options | Operation. -quiet: Prevent remotecopy from prompting for missing information. The command fails if all required information is no

Page 33 - Override Script Defaults

remotecopy. Retrieve folders of the same name from two hosts to the local machine. (Enter command on a single line.) remotecopy -local C:\temp\log -from -

Page 34

remotemdce. Purpose: Execute mdce command on one or more remote hosts by transport protocol. Syntax: remotemdce <mdce options><flags><protoco

Page 35 - Access Service Record Files

Toolbox and Server Components. In this section... “Schedulers, Workers, and Clients” on page 1-5; “Third-Party Schedulers” on

Page 36

remotemdce. Flags and Options | Operation. -protocol <type>: Force the usage of a particular protocol type. Specifying a protocol type with all its require

Page 37 - SECURITY_LEVEL

remotemdce. Start mdce in a clean state on two UNIX operating system machines from a Windows operating system machine, using the ssh protocol. Enter th

Page 38

startjobmanager. Purpose: Start job manager process. Syntax: startjobmanager; startjobmanager -flags. Description: startjobmanager starts a job manager process

Page 39 - SetSecureCommunication

startjobmanager. Flag | Operation. -clean: Deletes all checkpoint information stored on disk from previous instances of this job manager before starting. Thi

Page 40

startworker. Purpose: Start MATLAB worker session. Syntax: startworker; startworker -flags. Description: startworker starts a MATLAB worker process under the

Page 41 - Troubleshoot Common Problems

startworker. Flag | Operation. -jobmanagerhost <job_manager_hostname>: Specifies the host on which the job manager is running. The worker contacts the job

Page 42

startworker. Start two workers, named worker1 and worker2, on the host WorkerHost, registering with the job manager MyJobManager that is running on th
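
The two-worker example above is cut off in this preview, so the exact command lines are not shown. As an illustration only, the Python sketch below assembles one startworker invocation per worker. The flags -jobmanagerhost, -name, -remotehost, and -v appear elsewhere in this listing; the -jobmanager flag and the host name JMHost are assumptions made for the sketch and should be checked against the startworker help on your installation.

```python
import shlex


def startworker_command(name, jobmanager, jobmanager_host, remote_host):
    """Assemble one startworker invocation as an argument list.

    The list form can be passed to subprocess.run without shell quoting
    problems; nothing is executed here.
    """
    return [
        "startworker",
        "-jobmanagerhost", jobmanager_host,  # host running the job manager
        "-jobmanager", jobmanager,           # assumed flag: job manager name
        "-name", name,                       # name of this worker
        "-remotehost", remote_host,          # host on which the worker starts
        "-v",                                # verbose output
    ]


# JMHost is a hypothetical host name; WorkerHost and MyJobManager come
# from the example text above.
commands = [
    startworker_command(worker, "MyJobManager", "JMHost", "WorkerHost")
    for worker in ("worker1", "worker2")
]

for command in commands:
    # shlex.join renders the list as a single copy-and-paste command line.
    print(shlex.join(command))
```

Building the argument list in one place keeps the worker names as the only varying part, which makes it harder to register two workers under the same name by mistake.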

Page 43 - Required Ports

stopjobmanager. Purpose: Stop job manager process. Syntax: stopjobmanager; stopjobmanager -flags. Description: stopjobmanager stops a job manager that is running

Page 44

stopjobmanager. Flag | Operation. -baseport <port_number>: Specifies the base port that the mdce service on the remote host is using. You need to specify

Page 45 - Host Communications Problems

stopworker. Purpose: Stop MATLAB worker session. Syntax: stopworker; stopworker -flags. Description: stopworker stops a MATLAB worker process that is running und

Page 46

1 Introduction. Figure labels: Worker, Scheduler, Client, Worker, Worker, Client, Job, All Results, Job, All Results, Task, Results, Task, Results, Task, Results. Figure caption: Interactions of Parallel Computing

Page 47

stopworker. Flag | Operation. -baseport <port_number>: Specifies the base port that the mdce service on the remote host is using. You need to specify thi

Page 48

Glossary. CHECKPOINTBASE: The name of the parameter in the mdce_def file that defines the location of the checkpoint directories for the MATLAB job schedu

Page 49 - Product Installation

Glossary. distributed application: The same application that runs independently on several nodes, possibly with different input parameters. There is no co

Page 50

Glossary. homogeneous cluster: A cluster of identical machines, in terms of both hardware and software. independent job: A job composed of independent tasks

Page 51 - Client Node

Glossary. mdce_def file: The file that defines all the defaults for the mdce processes by allowing you to set preferences or definitions in the form of pa

Page 52

Glossary. spmd (single program multiple data): A block of code that executes simultaneously on multiple workers in a parallel pool. Each worker can oper

Page 54

Index. A: admincenter control script 5-2; administration, network 2-1. C: checkpoint folder, locating 2-18; clean state, starting services 2-16; client, process 1-5; con

Page 55

Index. R: remotecopy control script 5-8; remotemdce control script 5-11; requirements 2-3. S: scheduler, third-party 1-6; security 2-4; startjobmanager control script 5

Page 56

Toolbox and Server Components. scheduler, PBS Pro scheduler, TORQUE scheduler, mpiexec, or a generic scheduler. Choosing Between a Scheduler and MJS. Yo

Page 57

1 Introduction. • Who administers your cluster? The person administering your cluster might have a preference for how jobs are scheduled. Components on Mi

Page 58

Using Parallel Computing Toolbox™ Software. A typical Parallel Computing Toolbox client session includes the f

Page 59

1 Introduction 1-10

Page 60

2 Network Administration. This chapter provides information useful for network administration of Parallel Computing Toolbox software and MATLAB

Page 61

How to Contact MathWorks. www.mathworks.com Web; comp.soft-sys.matlab Newsgroup; www.mathworks.com/contact_TS.html Technical [email protected] Pro

Page 62 - MyMJS to run on host node1

2 Network Administration. Prepare for Parallel Computing. In this section... “Plan Your Network Layout” on page 2-2; “Network Requirements” on page 2-3; “Full

Page 63

Prepare for Parallel Computing. running on all machines that run job manager sessions or workers that are registered with a job manager. (The mdce servic

Page 64

2 Network Administration. Security Considerations. The parallel computing products do not provide any security measures. Therefore, be aware of the follow

Page 65

Install and Configure. To find the most up-to-date instructions for installing and configuring the current or past versions of the p

Page 66

2 Network Administration. Use Different MPI Builds on UNIX Systems. In this section... “Build MPI” on page 2-6; “Use Your MPI Build” on page 2-6. Build MPI. To

Page 67

Use Different MPI Builds on UNIX® Systems. 1 Test your build by running the mpiexec executable. The build should be ready to test if its bin/mpiexec and li

Page 68

2 Network Administration. any), together. Set the configuration’s MpiexecFileName property to /opt/mpich2/mpich2-1.4.1p1/bin/mpiexec. • If you are usin

Page 69 - Time (UNIX)

Shut Down a Job Manager Cluster. In this section... “UNIX and Macintosh Operating Systems” on page 2-9; “Microsoft Windows O

Page 70

2 Network Administration. If you have more than one worker session running, you can stop each of them individually by host and name. stopworker -name worker1 -remoteh

Page 71

Shut Down a Job Manager Cluster. Microsoft Windows Operating Systems. Stop the Job Manager and Workers. Enter the commands of this section at the prompt in

Page 72

Revision History. November 2005 | Online only | New for Version 2.0 (Release 14SP3+); December 2005 | Online only | Revised for Version 2.0 (Release 14SP3+); March

Page 73

2 Network Administration. service while leaving the machine on, enter the following commands at a DOS command prompt: cd matlabroot\toolbox\distcomp\b

Page 74

Custom Startup Parameters. In this section... “Define Script Defaults” on page 2-13; “Override Script Defaults” on page 2-15. The M

Page 75

2 Network Administration. Note: If you want to run more than one job manager on the same machine, they must all have unique names. Specify the names us

Page 76 - Configure for HPC Server

Custom Startup Parameters. Privilege | Purpose | Local Security Settings Policy. SeServiceLogonRight | Required to log on using the service logon type. | Log on as a se

Page 77 - CLUSTER_NAME. If you

2 Network Administration. Alternatively, you can make a copy of this file, modify the copy, and specify that this copy be used for the default parameter

Page 78

Access Service Record Files. In this section... “Locate Log Files” on page 2-17; “Locate Checkpoint Folders” on page 2-18. The

Page 79 - Configure for HPC Server

2 Network Administration. Locate Checkpoint Folders. Checkpoint folders contain information related to persistence data, which the server services use to

Page 80

Set MJS Cluster Security. In this section... “Set the Security Level” on page 2-19; “Local, MJS, and Network Passwords” on page 2-2

Page 81

2 Network Administration. Security Level | Description | User Requirements. • Tasks run as the user who started the mdce process on the worker machines (typica

Page 82

Set MJS Cluster Security. Security Level | Description | User Requirements. your system/network user name and password, because the worker must log you in to ru

Page 84

2 Network Administration. You must also provide a value for the SHARED_SECRET_FILE parameter in the mdce_def file, identifying where the file can be fo

Page 85 - TORQUE Scheduler

Troubleshoot Common Problems. In this section... “License Errors” on page 2-23; “Memory Errors on UNIX Operating Systems” on pa

Page 86 - JobStorageLocation

2 Network Administration. • If you receive this error when starting a worker with MATLAB Distributed Computing Server software: - You may be calling the

Page 87 - 3 Click Validate

Troubleshoot Common Problems. - If you installed only the Parallel Computing Toolbox product, and you are attempting to run a worker on the same machine

Page 88

2 Network Administration. With Third-Party Scheduler. Before the worker processes start, you can control the range of ports used by the workers for commun

Page 89

Troubleshoot Common Problems. Ephemeral TCP Ports with Job Manager. If you use the job manager on a cluster of nodes running Windows operating systems, you

Page 90

2 Network Administration. With Command-Line Interface. First, be sure that the machines in question agree on their IP resolutions. The IP address for a pa
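
One way to check that machines agree on their IP resolutions is to resolve the same host names on each machine and compare the results. The Python sketch below is an assumption-labeled illustration, not a procedure from this guide; the function names resolve and report are hypothetical, and the host names you pass in are your own.

```python
import socket


def resolve(hostname):
    """Return the sorted IPv4 addresses that hostname resolves to.

    An empty list means the name could not be resolved on this machine.
    """
    try:
        # gethostbyname_ex returns (canonical name, aliases, address list).
        _, _, addresses = socket.gethostbyname_ex(hostname)
    except OSError:
        return []
    return sorted(set(addresses))


def report(hostnames):
    """Map each host name to the addresses this machine resolves it to."""
    return {name: resolve(name) for name in hostnames}
```

Running report(["host1name", "host2name"]) on every machine involved and comparing the dictionaries shows where two machines disagree about an address; the host names here are placeholders.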

Page 91

Troubleshoot Common Problems. Verify Multicast Communications. Note: Multicast is required on the head node running the MATLAB job scheduler (MJS) and on th

Page 92

2 Network Administration. The following example shows how to use the Java class inside MATLAB. Start MATLAB on two machines (e.g., host1name and host2

Page 93 - Using Passwordless Delegation

3 Product Installation • “Install Products and Choose Cluster Configuration” on page 3-2 • “Configure for an MJS” on page 3-5 • “Configure for HPC Server” on pa

Page 94

Contents. 1 Introduction. MATLAB Distributed Computing Server Product Description ... 1-2. Key Features ...

Page 95

3 Product Installation. Install Products and Choose Cluster Configuration. In this section... “Cluster Description” on page 3-2; “Install Products” on pa

Page 96

Install Products and Choose Cluster Configuration. Figure labels: MDCS Cluster, Client Node PCT. Figure caption: Product Installations on Client Nodes. Install Products. On the Cluster Node

Page 97 - \toolbox\local

3 Product Installation. Configure Your Cluster. When the cluster and client installations are complete, you can proceed to configure the products for

Page 98

Configure for an MJS. In this section... “Configure Cluster to Use a MATLAB Job Scheduler (MJS)” on page 3-5; “Configure Windows Firew

Page 99 - @deleteJobFcn

3 Product Installation. Step 1: Set Up Windows Cluster Hosts. If this is the first installation of MATLAB Distributed Computing Server on a cluster of W

Page 100 - 3 Product Installation

Configure for an MJS. matlabroot\toolbox\distcomp\bin\mdce_def.bat 2 Find the line for setting the MDCEUSER parameter, and provide a value in the form do

Page 101

3 Product Installation. cd oldmatlabroot\toolbox\distcomp\bin 3 Stop and uninstall the old mdce service and remove its associated files by typing the

Page 102

Configure for an MJS. Using Admin Center GUI. Note: To use Admin Center, you must run it on a computer that has direct network connectivity to all the n

Page 103 - Admin Center

3 Product Installation. b Click Add or Find. The Add or Find Hosts dialog box opens. c Select Enter Hostnames, then list your hosts in the text box. Y

Page 104 - Start Admin Center

Configure for an MJS. Keep the check to start mdce service. d Click OK to open the Start mdce service dialog box. Proceed through the steps clicking Next

Page 105 - Set Up Resources

Use Your MPI Build ... 2-6. Shut Down a Job Manager Cluster ... 2-9. UNIX and Macintosh Operating Systems ...

Page 106 - 4 Admin Center

3 Product Installation. It might take a moment for Admin Center to communicate with all the nodes, start the services, and acquire the status of all of

Page 107 - Start an MJS

Configure for an MJS. If any of the connectivity tests fail, double-click the icon that indicates a failure to get information about that specific te

Page 108

3 Product Installation. a To start an MJS (job manager), click Start in the MJS module. (This is one of several ways to open the New MJS dialog b

Page 109 - Start Workers

Configure for an MJS. e Click OK to start the workers and return to the Admin Center dialog box. It might take a moment for Admin Center to initialize al

Page 110

3 Product Installation. If you encounter any problems or failures, contact the MathWorks install support team. For more information about Admin Center fu

Page 111

Configure for an MJS. Command Window, and select Run as Administrator. This option is available only if you are running User Account Control (UAC). ii If you

Page 112

3 Product Installation. 2 Start the MJS. To start the MATLAB job scheduler (MJS), enter the following commands in a DOS command window. You do not have t

Page 113 - Test Connectivity

Configure for an MJS. cd matlabroot\toolbox\distcomp\bin b Start the workers on each node, using the text for <MyMJS> that identifies the name of th

Page 114

3 Product Installation. indicate protocol, platform (such as in a mixed environment), or other information, see the help for remotemdce by typing ./remo

Page 115

Configure for an MJS. cd matlabroot/toolbox/distcomp/bin b Start the workers on each node, using the text for <MyMJS> that identifies the name of th

Page 116 - .mdcs to the filename

Configure Cluster to Use a MATLAB Job Scheduler (MJS) ... 3-5. Configure Windows Firewalls on Client ...

Page 117 - Prepare for Cluster Profiles

3 Product Installation. Debian, Fedora Platforms. On each cluster node, register the mdce service as a known service and configure it to start automatic

Page 118

Configure for an MJS. 4 Look in /etc/inittab for the default run level. Create a link in the rc folder associated with that run level. For example, if

Page 119 - Alphabetical List

3 Product Installation. sudo ln -s matlabroot/toolbox/distcomp/bin/mdce /usr/sbin/mdce 3 Copy the launchd .plist file for mdce to /Library/LaunchDae

Page 120 - Syntax admincenter

Configure for an MJS. 1 On the client computer where Parallel Computing Toolbox is installed, open a DOS command window (for Windows software) or a shell (for UNIX so

Page 121 - See Also mdce

3 Product Installation. 5 Click Done to save your cluster profile. Step 3: Validate the Cluster Profile. In this step you validate your cluster profile,

Page 122 - Syntax mdce install

Configure for an MJS. Note: If your validation does not pass, contact the MathWorks install support team. If your validation passed, you now have a valid

Page 123

3 Product Installation. Configure for HPC Server. In this section... “Configure Cluster for Microsoft Windows HPC Server” on page 3-28; “Configure Client Co

Page 124 - Syntax nodestatus

Configure for HPC Server. Note: If you need to override the script default values, modify the values defined in MicrosoftHPCServerSetup.xml before running

Page 125

3 Product Installation. Note: If you need to override the default values of the script, modify the values defined in MicrosoftHPCServerSetup.xml before running Microsof

Page 126

Configure for HPC Server. b Set the NumWorkers field to the number of workers you want to run the validation tests on, within the limitation of you

Page 127

Test Connectivity ... 4-11. Export and Import Sessions ... 4-14. Prepare for Cluster Profiles ...

Page 128 - See Also remotemdce

3 Product Installation. 5 Click Done to save your cluster profile. Step 2: Validate the Configuration. In this step you validate your cluster profile, a

Page 129

Configure for HPC Server. Note: If your validation does not pass, contact the MathWorks install support team. If your validation passed, you now have a

Page 130

3 Product Installation. Configure for PBS Pro, Platform LSF, TORQUE. In this section... “Configure Platform LSF Scheduler on Windows Cluster” on page 3-

Page 131 - See Also mdce

Configure for PBS Pro, Platform LSF, TORQUE. To use mpiexec to distribute a job, the smpd service must be running on all nodes that will be used for run

Page 132 - Syntax startjobmanager

3 Product Installation. 4 If you are using Windows firewalls on your cluster nodes, execute the following in a DOS command window. matlabroot\toolbox\dis

Page 133

Configure for PBS Pro, Platform LSF, TORQUE. shared installation), execute the following command in a DOS command window. matlabroot\bin\matlab.bat -in

Page 134 - Syntax startworker

3 Product Installation. 1 Start the Cluster Profile Manager from the MATLAB desktop by selecting on the Home tab in the Environment area Parallel >

Page 135 - -remotehost WorkerHost

Configure for PBS Pro, Platform LSF, TORQUE. 5 Click Done to save your cluster profile. Step 2: Validate the Cluster Profile. In this step you verify you

Page 136

3 Product Installation. Note: If your validation does not pass, contact the MathWorks install support team. If your validation passed, you now have a va

Page 137 - Syntax stopjobmanager

Configure for a Generic Scheduler. In this section... “Interfacing with Generic Schedulers” on page 3-42; “Configure Gene

Page 138

1 Introduction • “MATLAB® Distributed Computing Server™ Product Description” on page 1-2 • “Product Overview” on page 1-3 • “Toolbox and Server Components”

Page 139 - Syntax stopworker

3 Product Installation. Interfacing with Generic Schedulers • “Support Scripts” on page 3-42 • “Submission Mode” on page 3-42. Support Scripts. To support us

Page 140

Configure for a Generic Scheduler. Before using the support scripts, decide which submission mode describes your particular network setup. Configure Gener

Page 141 - Glossary

3 Product Installation. 2 Start smpd by typing in a DOS command window one of the following, as appropriate: matlabroot\bin\win32\smpd -install or matlabro

Page 142 - Glossary-2

Configure for a Generic Scheduler. 8 Repeat all these steps on all Windows nodes in your cluster. Using Passwordless Delegation. 1 Log in as a user with a

Page 143 - Glossary-3

3 Product Installation. Configure Sun Grid Engine on Linux Cluster. To run communicating jobs with MATLAB Distributed Computing Server and Sun™ Grid Engin

Page 144 - Glossary-4

Configure for a Generic Scheduler. qconf -mq all.q This will bring up a text editor for you to make changes: search for the line pe_list, and add matlab. 5 En

Page 145 - Glossary-5

3 Product Installation. Note: The remainder of this chapter illustrates only the case of using LSF in a nonshared file system. For other schedulers or a

Page 146 - Glossary-6

Configure for a Generic Scheduler. In this type of configuration, job data is copied from the client host running a Windows operating system to a host on

Page 147

3 Product Installation. 2 Start the Cluster Profile Manager from the MATLAB desktop by selecting Parallel > Manage Cluster Profiles. 3 Create a new p

Page 148

Configure for a Generic Scheduler. g Set the OperatingSystem to the operating system of your cluster worker machines. h Set HasSharedFilesystem to false,
