Building and Running a SAS Socket Server

Anthony M. Dymond, Dymond and Associates, LLC, Concord, California

ABSTRACT
Any computer connected to the Internet that can read and write to a socket can talk to any other computer with the same capabilities. This means that almost any computer today, regardless of hardware platform, operating system, or programming language, can talk to any other computer over the Internet. SAS® software supports socket server connections, allowing other computers with or without SAS loaded on them to start SAS jobs and recover files from remote locations. This paper briefly describes Internet sockets, describes a socket server written in the Base SAS language, and then describes a socket client written in Java™ that can communicate with the SAS socket server. Starting and using a SAS socket server will be demonstrated.

INTRODUCTION
Controlling SAS jobs from remote locations presents some interesting opportunities. For example, running several SAS socket servers is an alternative to multitasking, which requires expensive hardware and software. Dispersing several SAS jobs onto smaller machines using sockets avoids these expenses, avoids multitasking's platform dependence, and improves hardware utilization. As an example, in the evening several co-workers can leave their workstations on and running a socket server. One person can then run a production job composed of several SAS jobs by starting jobs on the co-workers' machines.

Since sockets use the Internet to communicate, it is as easy to use computers from a remote location as it is to use local machines. Sockets allow remote computers that are certified for special purposes or have access to unique data stores to be made available from a central location. Industries such as biotech can reduce costs by passing SAS jobs to a central verified computer.

Many industries, such as banks and telecommunications, have a substantial investment in legacy SAS code to run complex production processes. Business Process Management (BPM) and workflow software is available to build an external intelligent scheduler/controller for these large production environments if a way can be found to launch SAS jobs and read their output. A SAS socket connection will make this possible.

ADDRESSING AND USING A SERVER SOCKET
The Internet's Domain Name System (DNS) converts a server name into a numeric address, shown here with a placeholder address:

http://www.somename.com --> 123.456.789.123

When a client contacts a server it expects to connect to a particular application. The server may be running hundreds of applications, and the client finds the one it wants by knowing the correct port number on the server. For example, if the client is a web browser, it expects the application on the server to be a web server that by common agreement is located at port 80. So in this case:

browser(http://www.somename.com) --> 123.456.789.123: 80

A web server is just one type of application that can run on the server. Many applications other than web servers can be running behind their own designated ports, and many types of clients other than web browsers can connect to them if they know both the server address and the port number.

Once a connection is made, the client and server communicate by reading and writing messages to each other. The programming language used determines the format of the messages, but they usually look like text strings or files. How the client and server applications interpret these messages depends entirely on how they were programmed.
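The pairing of a server name with a port number can be illustrated with a small Java sketch. The class name Endpoint and the method parse are hypothetical names chosen for this illustration, and the sketch deliberately avoids any name lookup by using InetSocketAddress.createUnresolved, so it only shows how the two parts fit together.

```java
import java.net.InetSocketAddress;

/** Hypothetical helper for illustration: splits "host:port" text into its two parts. */
final class Endpoint {

    /** Parses text such as "localhost:5100" without performing a name lookup. */
    public static InetSocketAddress parse(String hostAndPort) {
        int colon = hostAndPort.lastIndexOf(':');
        if (colon <= 0 || colon == hostAndPort.length() - 1) {
            throw new IllegalArgumentException("expected host:port, got " + hostAndPort);
        }
        String host = hostAndPort.substring(0, colon);
        int port = Integer.parseInt(hostAndPort.substring(colon + 1));
        // createUnresolved keeps the name as text and rejects ports outside 0..65535
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        InetSocketAddress web = parse("www.somename.com:80");   // a web server port
        InetSocketAddress sas = parse("localhost:5100");        // the port used in this paper
        System.out.println(web.getHostString() + " -> port " + web.getPort());
        System.out.println(sas.getHostString() + " -> port " + sas.getPort());
    }
}
```

A client needs both values: the name selects the machine and the port selects the application on that machine.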


WRITING A SAS SOCKET SERVER
A SAS socket server can be written using only the Base SAS language. It does not require any additional software such as SAS/IntrNet or SAS/Connect. We will demonstrate a simple but useful server that allows a client (written in any language) to run specific SAS programs and to read specific files back to the client. Adding controls for specific programs and files adds some clutter to the example, but this is justified by the need to consider security for any application exposed to the Internet. An exposed server application would also need user name and password controls, but even if a hacker obtained these, the specific program and file controls shown here would provide limits and protection.

SAS uses a file to represent the socket connection. File writes and reads are used to pass information to and from the SAS server. This example is written in SAS version 8. The socket server allows only one connection at a time, and this connection exists only within the scope of a data step. The client connects to the server, the client passes a message to the server, the server passes a message back to the client, and the server ends the connection. For anyone new to the Internet this can seem like a strange way of doing things, but it is in fact the way most client-server Internet applications function.

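[Figure: a SAS session in which the socket is treated as a file (infile). A macro loop contains a data step that parses the BYE, QUERY, and LAUNCH commands. The client connects at the input statement and is disconnected at run, after which a requested program is launched.]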

The SAS socket server is implemented by a data step within a macro loop. The server listens for connections from a client only when the program pointer is at the INPUT line. One socket connection exists only within the scope of the data step.
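For readers more familiar with Java than with SAS, the same cycle (accept one connection, read one line, reply, disconnect) might be sketched as follows. This is a comparison sketch only, not part of the SAS server: the class OneShotServer, its methods, and the "Received:" reply are assumptions made for illustration, and only the BYE message text is taken from the SAS listing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical class for illustration: one connection, one request line, one reply. */
final class OneShotServer {

    /** Reads a single line and answers it; returns false when the client sent BYE. */
    public static boolean handle(BufferedReader in, PrintWriter out) {
        String line;
        try {
            line = in.readLine();          // waits here, like INPUT in the data step
        } catch (IOException exc) {
            throw new UncheckedIOException(exc);
        }
        if (line == null) {
            return true;                   // client disconnected without sending a line
        }
        if (line.trim().equalsIgnoreCase("BYE")) {
            out.println("Server stop is being requested by client.");
            return false;
        }
        out.println("Received: " + line.trim());   // placeholder reply, not in the paper
        return true;
    }

    /** Accepts one client; leaving the try block closes the connection. */
    public static boolean serveOnce(ServerSocket server) throws IOException {
        try (Socket client = server.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            return handle(in, out);
        }
    }

    /** Demonstrates handle() with in-memory streams so no port is opened. */
    public static void main(String[] args) {
        StringWriter reply = new StringWriter();
        boolean keepLooping = handle(new BufferedReader(new StringReader("BYE\r\n")),
                                     new PrintWriter(reply, true));
        System.out.print(reply);
        System.out.println("keep looping: " + keepLooping);
    }
}
```

To listen on a real port, serveOnce could be called repeatedly on a ServerSocket opened on port 5100 until it returns false. As with the SAS data step, a new client can be accepted only while the code is waiting in accept.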

Below is the SAS server file listing. Notes in the body of the code are expanded below the listing.


/*---------------------------------------------------------------------------*/
/* ...socketDemo\server.sas                                                  */
/*                                                                           */
/* A socket server written in Base SAS(r).                                   */
/*                                                                           */
/* It will run specific SAS programs based on the client commands:           */
/*    LAUNCH RUNTHIS                                                         */
/*    LAUNCH RUNTHAT                                                         */
/*                                                                           */
/* It will return specific files based on the client commands:               */
/*    QUERY READTHIS                                                         */
/*    QUERY READTHAT                                                         */
/*                                                                           */
/* It will stop the loop based on the client command:                        */
/*    BYE                                                                    */
/*                                                                           */
/* NOTE: This code is informational only. It is not supported. It has        */
/*       not undergone a security audit and should not be used without       */
/*       review and enhancement by qualified technical staff.                */
/*                                                                           */
/* Copyright (C) 2003 Dymond and Associates, LLC, All rights reserved.       */
/*---------------------------------------------------------------------------*/
options source notes noxwait;
filename _ALL_ CLEAR;
proc printto log = log; run;      /* reset SAS log file */

%global launch loop thisroot;

/*** change 'thisroot' to match your directory   [note 1] ***/
%let thisroot = c:\datasets\code\socketDemo;

/*** The socket server runs on 'localhost' port 5100   [note 2] ***/
filename srvsoc SOCKET 'localhost:5100' SERVER reconn=0 ;

/*** Files to query (client can request they be returned from the server) ***/
%let READTHIS = &thisroot.\readthisfile.txt;
%let READTHAT = &thisroot.\readthatfile.txt;
filename READTHIS "&READTHIS";
filename READTHAT "&READTHAT";
/* Only these files can be requested from the server */
%let querylist = *READTHIS*READTHAT*;

/*** Programs that can be run on the server. ***/
%let RUNTHIS = &thisroot.\runthisprogram.sas;
%let RUNTHAT = &thisroot.\runthatprogram.sas;
/* Only these sas programs can be run (launched) from the server */
%let launchlist = *RUNTHIS*RUNTHAT*;

/* In the 'query' section of the data step files can be selected for return */
/* from the server. Only allow files that physically exist.                 */
%macro goodfiles;
  %if %sysfunc(fexist(READTHIS)) %then
    if argument='READTHIS' then infile READTHIS end=lastrow %str(;);
  %if %sysfunc(fexist(READTHAT)) %then
    if argument='READTHAT' then infile READTHAT end=lastrow %str(;);
%mend goodfiles;


/*-----------------------------------------------------------------------------*/
/* The macro loop will run as long as &loop>0. The data step inside the loop   */
/* will allow one new connection that will exist only within the scope of the  */
/* step. Ending the data step disconnects the client socket.                   */
/* The data step waits at the line 'input' until a line is provided by the     */
/* client socket.                                                              */
/*-----------------------------------------------------------------------------*/
%macro sassrv;
%do %while(&loop);                /* [note 3] */
  %put STARTING SAS SOCKET SERVER LOOP;
  %let launch=;

  /*------------------------------------------------------------------------*/
  /* Send log to server log file.                                           */
  /*------------------------------------------------------------------------*/
  /*proc printto log = "&srvlog"; run; */

  /*------------------------------------------------------------------------*/
  /* Wait for client to connect, then read one line. Parse the line and     */
  /* bind 'command' and 'argument' variables.                               */
  /*------------------------------------------------------------------------*/
  data _null_;
    length linein command argument $ 256;
    infile srvsoc;    /* [note 4] point to server as input file */
    input;            /* [note 5] wait here to read one line from client */
    linein=left(_infile_);
    linein=substr(linein,1,length(linein)-1);   /* trailing crud character */
    put linein=;
    if upcase(linein)=:'BYE' then do;
      command='BYE';
    end;
    else if upcase(linein)=:'QUERY' then do;
      arg= trim(left(substr(linein,6)));
      command='QUERY';
      argument=trim(arg);
    end;
    else if upcase(linein)=:'LAUNCH' then do;
      arg= trim(left(substr(linein,7)));
      command='LAUNCH';
      argument=trim(arg);
    end;
    else do;
      file log;    put 'Cannot parse command.';
      file srvsoc; put 'Cannot parse command.';
    end;
    file log;
    put command=;
    put argument=;


    /*---------------------------------------------------------------------*/
    /* BYE: Shutdown server if requested.                        [note 6]  */
    /*---------------------------------------------------------------------*/
    if command='BYE' then do;
      file log;
      put 'Server stop is being requested by client.';   /* send to sas log */
      file srvsoc;
      put 'Server stop is being requested by client.';   /* send to client */
      call symput('loop','0');                  /* stop loop on next pass */
    end;

    /*---------------------------------------------------------------------*/
    /* QUERY: Write a predefined file to the client.                       */
    /*---------------------------------------------------------------------*/
    if command='QUERY' then do;
      if NOT index("&querylist",'*'||trim(argument)||'*') then do;
        file log;    put 'This file cannot be queried';
        file srvsoc; put 'This file cannot be queried';
      end;
      else if NOT fexist(argument) then do;
        file log;    put 'The file does not exist';
        file srvsoc; put 'The file does not exist';
      end;
      else do;
        file log;    put 'Ready to read a file';
        file srvsoc; put 'Ready to read a file';
        %goodfiles;   /* [note 7] select the infile */
        do until(lastrow);
          input;
          put _infile_;
        end;
      end;
    end;

    /*---------------------------------------------------------------------*/
    /* LAUNCH: Run a predefined program.                                   */
    /*---------------------------------------------------------------------*/
    if command='LAUNCH' then do;
      if NOT index("&launchlist",'*'||trim(argument)||'*') then do;
        file log;    put 'This file cannot be run';
        file srvsoc; put 'This file cannot be run';
      end;
      else do;
        file log;    put 'Launching program ' argument;
        file srvsoc; put 'Launching program ' argument;
        call symput('launch',trim(argument));
      end;
    end;
    stop;
  run;


  /*---------------------------------------------------------------------------*/
  /* RUN SAS PROGRAM: use %include to start a sas program.          [note 8]   */
  /*---------------------------------------------------------------------------*/
  %put launch=***&launch***;
  %if &launch^= %then %do;
    %include "&&&launch";
  %end;

  /*---------------------------------------------------------------------------*/
  /* Release the SAS log.                                                      */
  /*---------------------------------------------------------------------------*/
  proc printto log=log; run;
%end;
%mend sassrv;

%let loop=1;
%sassrv;

[note 1] This is the directory where this program runs. Change it to match your computer.

[note 2] SAS makes it easy to establish a socket by using a FILENAME statement. This line of code creates a server socket named "srvsoc" at port 5100 on the computer where the code is running. The Base SAS code can now address this socket as if it were a file. Only one connection at a time is allowed to the server. "localhost" is the loopback address for your local computer and allows testing without being connected to the outside Internet. Change this to 'www.somename.com' to connect to a remote machine.

[note 3] This is the beginning of a macro loop that will run as long as the loop variable &loop > 0. The top part of the loop contains a data _null_ step that will service the socket connection. The bottom part has a few lines of macro code that are used to start a SAS program if this was requested.

[note 4] Read data from the "file" srvsoc.

[note 5] The data _null_ step will wait here until a client connects and transmits a "file." Only one line, which should contain the command to the server, is read. The connection will exist until the end of this data step, at which time the server will end the connection with the client. Note that this line is the only point where the server is able to receive a new connection. When the code is executing elsewhere within the macro loop the server will not respond to a client's request for a connection.

[note 6] There will be one write to the SAS log (file log) and a second write to the client through the socket (file srvsoc).

[note 7] The macro "goodfiles" ensures that lines of code containing the INFILE statement will be written only when the referenced file actually exists.

[note 8] This macro code occurs after the end of the data _null_ step (and the end of the client socket connection) and uses a %INCLUDE to run a specific SAS program if this was requested.
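The parsing and allow-list checks performed in the data step can also be expressed outside SAS, which may help when reasoning about what the server will accept. The Java sketch below is an approximation under stated assumptions: the class CommandParser and its methods are hypothetical names, the two list strings are copied from &querylist and &launchlist, and the extra rejection of a '*' inside the argument is an addition that is not in the SAS listing.

```java
import java.util.Locale;

/** Hypothetical class for illustration: approximates the data step's parse and checks. */
final class CommandParser {

    // Same delimiter-wrapped lists as &querylist and &launchlist in server.sas.
    public static final String QUERY_LIST = "*READTHIS*READTHAT*";
    public static final String LAUNCH_LIST = "*RUNTHIS*RUNTHAT*";

    /** Returns {command, argument}; command is "" when the line cannot be parsed. */
    public static String[] parse(String rawLine) {
        String line = rawLine.trim();                    // also drops a trailing carriage return
        String upper = line.toUpperCase(Locale.ROOT);    // commands are matched without regard to case
        if (upper.startsWith("BYE")) {
            return new String[] {"BYE", ""};
        }
        if (upper.startsWith("QUERY")) {
            return new String[] {"QUERY", line.substring(5).trim()};
        }
        if (upper.startsWith("LAUNCH")) {
            return new String[] {"LAUNCH", line.substring(6).trim()};
        }
        return new String[] {"", ""};
    }

    /** True only when the whole argument sits between two '*' delimiters in the list. */
    public static boolean isAllowed(String list, String argument) {
        if (argument.isEmpty() || argument.contains("*")) {
            return false;                                // extra guard, not in the SAS original
        }
        return list.contains("*" + argument + "*");
    }

    public static void main(String[] args) {
        String[] parsed = parse("query READTHIS\r");
        System.out.println(parsed[0] + " / " + parsed[1]);
        System.out.println(isAllowed(QUERY_LIST, parsed[1]));   // whole name is listed
        System.out.println(isAllowed(QUERY_LIST, "READ"));      // partial name is refused
    }
}
```

Wrapping each name in '*' delimiters is what keeps a partial name such as READ from matching an entry in the list.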


TESTING THE SERVER SOCKET
Begin by starting the SAS program shown above. The macro loop will cause the job to be continuously "running" in the SAS program window.

A telnet window is a simple way to test a socket server. On a Windows® machine, use Start -> Run -> telnet to start the telnet application. When the telnet window opens use Connect -> Remote System... and then enter the connection information as:

Host Name: localhost
Port: 5100
TermType: vt100

If a connection is made the text at the top of the telnet window should change from "Telnet-[None]" to "Telnet-localhost." The connection to the server can be confirmed by entering a few random keystrokes followed by a carriage return. The server should respond with the message "Cannot parse command" and the socket connection should be released. If this is successful the remainder of the server functions can be exercised. Recall that the server ends the connection, so it will have to be reestablished each time prior to issuing a command.

WRITING A JAVA SOCKET CLIENT
We want to replace the manual telnet window with a client written in some programming language. A number of languages are capable of doing this, and the socket client shown below is written in Java. This client is run from a DOS window. It attempts to connect to the socket server, transfer the command line arguments to the socket server, and then read and echo all the lines returned from the socket server.

import java.io.*;
import java.net.*;

/*===================================================================*
 * ...socketDemo\Client.java                                         *
 *                                                                   *
 * A Java client socket talking to a SAS server socket.              *
 *                                                                   *
 * Copyright (C) 2003 by Dymond and Associates, LLC.                 *
 * All rights reserved.                                              *
 *===================================================================*/

/**
 * Usage: [note 1]
 *   java Client BYE                 //stop SAS server
 *   java Client "QUERY READTHIS"    // read files from SAS server
 *   java Client "QUERY READTHAT"
 *   java Client "LAUNCH RUNTHIS"    // start jobs on SAS server
 *   java Client "LAUNCH RUNTHAT"
 */
public class Client {

    //-------------------------------------------------------------//
    // constructor                                                 //
    //-------------------------------------------------------------//
    public Client(String[] args) {
        super();
        String linein;
        Socket client = null;
        int counter = 0;

        System.out.println("args: "+args[0]);

        //----------------------------------------------------------//
        // Loop while trying to connect to the SAS server socket.   //
        // Sleep for one second between attempts to connect.        //
        //----------------------------------------------------------//
        while(true) {                                    // [note 2]
            try {
                client = new Socket("localhost",5100);   //try to connect
                if (client!=null) { break; }             //got connection
            }
            catch (IOException exc) {                    //did not get connection
                counter++;
                if (counter>10) {
                    System.out.println("Java client timed out waiting for a socket.");
                    System.out.println("LAST EXCEPTION: "+exc);
                    return;
                }
                else {
                    try { Thread.sleep(1000); }          //sleep one second
                    catch (InterruptedException iexc) {}
                }
            }
        }

        //----------------------------------------------------------//
        // Send the command line argument out the socket.           //
        //----------------------------------------------------------//
        try {
            BufferedReader inSoc = new BufferedReader(
                new InputStreamReader(client.getInputStream()));
            PrintWriter outSoc = new PrintWriter(
                new BufferedOutputStream(client.getOutputStream()), true);
            outSoc.println(args[0]);                     // [note 3]

            //----------------------------------------------------------//
            // Read what comes back and echo it to sysout.              //
            //----------------------------------------------------------//
            System.out.println("Beginning read loop.");
            while(true) {
                linein=inSoc.readLine();                 // [note 4]
                if (linein==null || linein.trim().equalsIgnoreCase("BYE") ) {
                    break;
                }
                else {
                    System.out.println("from sas: "+linein);
                }
            }
            client.close();
        }
        catch (Exception exc) {
            System.out.println("Error: "+exc);
        }
    } // end constructor


    //-------------------------------------------------------------//
    // main()                                                      //
    //-------------------------------------------------------------//
    public static void main(String[] args) {
        Client cl = new Client(args);
    } // end main
} //end class Client

[note 1] This client socket is run from the command line with the directory set to the location of the compiled Java program (Client.class). An example of the command line is:

c:\some_directory>java Client "QUERY READTHIS"

where java starts the Java virtual machine, Client is the name of the Java program to run, and "QUERY READTHIS" is the argument passed to the Java program that is in turn sent as a message to the SAS socket server.

[note 2] The client will make repeated tries to connect to the server with a one second rest between tries. This loop begins with an attempt to connect to the socket server at address "localhost" and port 5100. If a connection is successful, as indicated by the object "client" being non-null, the loop is exited. If unsuccessful, control passes to the catch block that increments and examines the loop counter. If the loop counter exceeds 10 (that is, after the eleventh failed attempt) the program ends with an error message. Otherwise, it sleeps for one second and returns to the top of the loop.

[note 3] After creating inSoc and outSoc objects to read and write to the socket, the program sends the first command line argument to the socket server. Only the one line is transmitted to the server.

[note 4] The client begins a read loop that will read as many lines as are sent back from the server. Each line is echoed to the DOS window. When the server has finished sending lines it will close the socket connection. The client closes the connection on its end and finishes.
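The bounded retry described in note 2 can be separated from the socket code so that the attempt limit is explicit. The following is one possible sketch, not the paper's client: the class Retry and the method withRetry are hypothetical names, and the example in main uses an in-memory action that fails twice instead of a real connection.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration: tries an action a limited number of times. */
final class Retry {

    /** Calls action up to maxAttempts times, sleeping sleepMillis after each failure. */
    public static <T> T withRetry(Callable<T> action, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();                    // success ends the loop
            } catch (Exception exc) {
                last = exc;                              // remember why this attempt failed
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);           // rest before the next attempt
                } catch (InterruptedException iexc) {
                    Thread.currentThread().interrupt();  // preserve the interrupt and give up
                    break;
                }
            }
        }
        throw new IllegalStateException("all attempts failed or were interrupted", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("server not ready");   // simulated refusal
            }
            return "connected";
        }, 10, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With a real server, the action could be a lambda such as () -> new Socket("localhost", 5100) with ten attempts and a 1000 millisecond sleep. Because the SAS server accepts connections only while it waits at the INPUT statement, a retry of this kind is what lets a client ride out the time the server spends running a launched program.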

RUNTHISPROGRAM.SAS LISTING
The runthisprogram.sas listing is shown below.

/*---------------------------------------------------------------------------*/
/* ...socketDemo\runthisprogram.sas                                          */
/*                                                                           */
/* Test Base SAS server communicating to Java sockets.                       */
/* Write a file named c:\datasets\code\socketDemo\readthisfile.txt           */
/*                                                                           */
/*---------------------------------------------------------------------------*/
options source notes;

/* change this file path for your system. */
filename out "c:\datasets\code\socketDemo\readthisfile.txt" ;

data _null_;
  dt= datetime();
  file out;
  put "datetime: " dt datetime18.;
  put "This is line 2 in the file 'readthisfile.'";
  put "This is line 3 in the file 'readthisfile.'";
  put "This is line 4 in the file 'readthisfile.'";
run;

CONCLUSION
The SAS socket server allows programs written in any modern programming language to invoke SAS jobs at remote locations. It also allows files of any type to be recovered from the remote location. This permits more effective resource utilization and access to computers with special characteristics. It also allows non-SAS packages to have access to SAS programs and data. Examples are scheduler/controllers and workflow systems that can orchestrate a collection of SAS jobs and data for production or data warehouse ETL (extract, transform, and load). Below is a figure showing a workflow manager composed of several Java modules controlling a SAS production run through a socket server at a remote location.

[Figure: DailyReports workflow. Local Java modules Job_A, Job_B, and Job_C connect through the SAS Socket Server to remote SAS jobs SAS_A [DiskSpace], SAS_B [CPU], and SAS_C [DiskSpace] [CPU] [File_B].]

A local controller/scheduler written in Java orchestrating SAS production at a remote location.
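As a rough idea of what the simplest such controller could look like, the Java sketch below issues a list of commands in order, one connection per command, and stops when the server refuses one. Everything here is an assumption made for illustration: the class Controller, the method runAll, and the transport function that stands in for a socket round trip. Only the command names and the refusal messages beginning "This file cannot" come from the server listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical class for illustration: sends commands in order, one connection each. */
final class Controller {

    /**
     * Sends each command through the supplied transport and records the replies.
     * Stops after the first reply that begins with "This file cannot".
     */
    public static List<String> runAll(List<String> commands,
                                      Function<String, List<String>> transport) {
        List<String> log = new ArrayList<>();
        for (String command : commands) {
            List<String> reply = transport.apply(command);   // one connection per command
            for (String line : reply) {
                log.add(command + " -> " + line);
            }
            if (!reply.isEmpty() && reply.get(0).startsWith("This file cannot")) {
                break;                                       // server refused, stop the run
            }
        }
        return log;
    }

    public static void main(String[] args) {
        // Fake transport standing in for a real socket round trip to the SAS server.
        Function<String, List<String>> fake = command ->
                Collections.singletonList("Launching program " + command.substring(7));
        List<String> commands = Arrays.asList("LAUNCH RUNTHIS", "LAUNCH RUNTHAT");
        for (String line : runAll(commands, fake)) {
            System.out.println(line);
        }
    }
}
```

Note that the server replies to LAUNCH before the program is included and run, so a reply only confirms that the request was accepted. A production controller would need its own way to confirm completion, for example by querying a file that the job writes.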


CONTACT INFORMATION
Anthony M. Dymond, Ph.D.
Dymond and Associates, LLC
4417 Catalpa Ct.
Concord, California 94521
(925) 798-0129
(925) 680-1312 (FAX)
http://www.dymondassoc.com

Automated Consultant is a registered trademark of Dymond and Associates, LLC. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.
