MP-DIT Mathematical Program Data Interchange Tool

EINDHOVEN UNIVERSITY OF TECHNOLOGY Department of Mathematics and Computing Science

Memorandum COS OR 92-31

MP-DIT Mathematical Program Data Interchange Tool M. Makowski M.W.P. Savelsbergh

Eindhoven, August 1992 The Netherlands

Eindhoven University of Technology Department of Mathematics and Computing Science Probability theory, statistics, operations research and systems theory P.O. Box 513 5600 MB Eindhoven - The Netherlands Secretariate: Dommelbuilding 0.03 Telephone: 040-47 3130

ISSN 0926 4493

MP-DIT Mathematical Program Data Interchange Tool Marek Makowski International Institute for Applied Systems Analysis 2361 Laxenburg Austria e-mail: [email protected] Martin W.P. Savelsbergh Eindhoven University of Technology P.O. Box 513 5600 MB Eindhoven The Netherlands e-mail: [email protected]


1

Introduction

In a paper on modeling environments Geoffrion [1989] states that 'modeling environments have the potential to greatly increase the productivity of model-based work through better tools, to improve the quality of model-based work through better support for good modeling style and work practices, and to improve the frequency of use of MS/OR by bringing about a more comfortable working relationship between MS/OR professionals and their constituencies.' Geoffrion argues that 'facilities for good management of key resources, namely data, models, solvers, and knowledge derived from these' is one of the desired characteristics of a modeling environment and that 'software integration' is one of the major challenges for the future.

The three major components of a modeling environment are a data base, see Welch [1987] for a discussion on the data management needs for mathematical programming applications, a modeling language, see Tomlin and Welch [1993] for a historic review of the developments in the area of matrix generators and modeling languages, and solvers. Over the years most research has been devoted to solvers, with a growing interest in modeling languages. The data base component, although maybe the most important from a practical point of view, has received the least attention from the mathematical programming community. Furthermore, since each of these fields is very specialized, components are often built by different groups. This makes software integration a huge challenge indeed.

An important step towards software integration is the definition of standard interfaces between the components. One such interface exists: the MPS format. An MPS input file describes an instance of a mathematical programming model, an MPS output file describes a solution to an instance of a mathematical programming model, and an MPS advanced basis file describes an advanced starting point, associated with the instance described in the MPS input file, for the solution process.
In fact, the only virtue of the MPS format is that it is a standard; all commercially available solvers support at least the MPS input file. However, it has many shortcomings as an interface: it is very inefficient, both in terms of time and space, and it is inflexible with respect to data manipulation.

Developing a large-scale linear programming model is an iterative process, because few modeling professionals ever get a model right the first time. Moreover, tuning a solver for a class of instances of a large-scale linear programming model is an iterative process as well, because there is a variety of methods available, such as the simplex method, the affine scaling interior point method, the primal-dual interior point method, and the primal-dual predictor-corrector interior point method, each with numerous parameters, and it is unclear at the start which of these methods will perform best on a specific class of instances.

Envision a sophisticated environment for the development and solution of large-scale linear programming models. Such an environment may consist of a modeling language, such as GAMS [Brooke, Kendrick and Meeraus 1988], AMPL [Fourer, Gay and Kernighan 1990], or MODLER [Greenberg 1990a], one or more solvers, such as CPLEX [CPLEX 1990], OB1 [Lustig, Marsten, and Shanno 1989, 1990], or OSL [IBM 1991], an interactive system to provide computer-assisted analysis of model instances and their solutions, such as ANALYZE [Greenberg 1990b], and a tool to generate random variations around a base model for statistical experimentation, such as RANDMOD [Greenberg 1990c]. In such an environment there is a lot of communication between the various components. MPS files are not well suited to provide such a means of communication. They are inflexible and inefficient. The overhead of writing and reading a complete MPS file, even if a previously generated and solved instance is only slightly modified, is enormous. What is needed is a more flexible and efficient data interchange tool.

Another disadvantage of the MPS format is that it was designed for the description of instances of linear programming models only. Extensions have been in use for mixed integer programming models, but a more sophisticated data interchange tool should also support other types of models, such as quadratic programming, nonlinear programming, and multiobjective linear programming models. In addition, a sophisticated data interchange tool may even provide capabilities for the specification of structural information such as block structure in multi-period or multi-commodity flow models.

Most of the above observations are not new. Unfortunately, the MPS format is still the only available standard. In view of the growing importance of modeling environments and the associated challenge of software integration, the time seems right to attempt to design a viable alternative to the MPS format. This note presents some ideas on such an alternative and is intended to start a discussion among those interested in the subject.

2

Mathematical Program Data Interchange Tool (MP-DIT)

We propose, as a starting point for further discussion, an easily extendable and modifiable data interchange tool based on the class concept of C++ [Stroustrup 1986]. A class is a new data type, i.e., a concrete representation of a concept, such as an instance of a mathematical program. The fundamental idea in defining a new data type is to separate the details of implementation from the properties essential for the correct use of it. With a class, access to its objects can be restricted to a set of member functions declared as part of the class. One of the benefits is that a potential user of the data type need only examine the definition of the member functions to learn to use it.

We propose to define a data type that stores and manipulates instances of mathematical programs, a data type that stores and manipulates solutions to instances of mathematical programs, and possibly a data type that stores and manipulates additional information. The basic member functions that should be provided for each of these data types are put, get, read, and write. The put function provides an object of the class with data, get obtains data from an object of the class, read gets data from a file for an object of the class, and write puts data in a file from an object of the class. The put and get functions are meant for on-line use of an object of the class and read and write for the off-line use of an object of the class. An obvious implementation of the read and write functions would be to read and write MPS files so that compatibility with the existing standard is maintained.

Another advantageous feature of C++ is function overloading. Different functions typically have different names, but for functions performing similar tasks on different types of objects it is sometimes better to let these functions have the same name. In C++ it is possible to declare a function to be overloaded, i.e., it is possible to inform the compiler that the multiple use of a function name is intentional. Using this feature it would be easy to extend the class with functions tailored to specific applications. A developer of a modeling language and data base manager may provide a specialized put function that uses implementation details of the modeling system.

To provide the necessary flexibility with respect to data manipulation, the class should provide various ways to access the instance data, such as row-wise access, column-wise access, and triplet-wise access. To provide the necessary efficiency when used in an interactive environment, the class should provide various ways to store the instance data, such as storage in memory or storage on file. The envisioned data interchange tool can be implemented as a library of classes to be used by preprocessors, problem generators, solvers and report generators in order to provide efficient access to different sets of data processed by these modules.
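To make the relation between triplet-wise and column-wise access concrete, the following is a minimal sketch of how a list of triplets could be regrouped into column-wise storage. The types Triplet and ColumnStore and the function by_columns are hypothetical names introduced only for this illustration; they are not part of the proposed MP-DIT classes, and the layout with column start offsets is one common choice among several.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical triplet entry: a(row, col) = val. Not an MP-DIT type.
struct Triplet {
    int row;
    int col;
    double val;
};

// Hypothetical column-wise store: the nonzeros of column j occupy
// positions start[j] .. start[j+1]-1 of r (row indices) and v (values).
struct ColumnStore {
    std::vector<int> start;   // n+1 column start offsets
    std::vector<int> r;       // row index of each nonzero
    std::vector<double> v;    // value of each nonzero
};

// Regroup triplets by column with a counting sort on the column index.
ColumnStore by_columns(int n, const std::vector<Triplet>& t) {
    ColumnStore s;
    s.start.assign(n + 1, 0);
    for (std::size_t k = 0; k < t.size(); ++k) {
        assert(t[k].col >= 0 && t[k].col < n);
        ++s.start[t[k].col + 1];            // count nonzeros per column
    }
    for (int j = 0; j < n; ++j)
        s.start[j + 1] += s.start[j];       // prefix sums give start offsets
    s.r.resize(t.size());
    s.v.resize(t.size());
    std::vector<int> next(s.start.begin(), s.start.end() - 1);
    for (std::size_t k = 0; k < t.size(); ++k) {
        int pos = next[t[k].col]++;         // next free slot in that column
        s.r[pos] = t[k].row;
        s.v[pos] = t[k].val;
    }
    return s;
}
```

Row-wise storage can be obtained the same way by exchanging the roles of row and column indices, which suggests that a single internal representation may be enough to serve all three access types.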
When used for the interchange of data between a problem generator and a solver, it should be able to handle the minimal set of data that defines an instance of a mathematical program as well as other information needed by a solver, such as different limits, options, and tolerances. When used for the interchange of data between a solver and a report generator, it should provide easy and flexible handling of a solution. Any distribution of MP-DIT, preferably as source code, should have a basic, but fully operational, version of all functions. Each application can then replace one or more of them with tailored versions.
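The combination of restricted access and overloaded put and get functions described in this section can be illustrated with a small sketch. It is not the proposed interface: Header, Row and Instance are simplified, hypothetical stand-ins, storage is in memory only, and the use of references and void return types is an assumption made for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified, hypothetical stand-ins for an instance header and a row.
struct Header {
    std::string name;
    int n;   // number of columns
    int m;   // number of rows
};

struct Row {
    double lb;                // lower bound
    double ub;                // upper bound
    std::vector<int> c;       // column indices of nonzero coefficients
    std::vector<double> v;    // values of nonzero coefficients
};

class Instance {
    Header header_;
    std::vector<Row> rows_;   // in-memory storage only in this sketch
public:
    // Overloaded put: the argument type selects what is stored.
    void put(const Header& h) {
        header_ = h;
        rows_.assign(h.m, Row());
    }
    void put(int ix, const Row& row) {
        assert(ix >= 0 && ix < header_.m);
        assert(row.c.size() == row.v.size());
        rows_[ix] = row;
    }
    // Overloaded get mirrors put.
    void get(Header& h) const { h = header_; }
    void get(int ix, Row& row) const {
        assert(ix >= 0 && ix < header_.m);
        row = rows_[ix];
    }
};
```

A problem generator would call put with a header and then with individual rows, while a solver would call the matching get functions; neither needs to know how Instance stores its data. An application could add a further put overload that accepts its own internal data structure without changing the existing ones.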

3

Outline of MP-DIT classes

Below we outline the basic MP-DIT classes. We do not present specific implementation details, because at this point we are mainly interested in the functionality of the tool. This section is intended to give an idea of what we have in mind; the actual implementation will depend on assumptions still open for discussion. For presentational convenience, we have restricted ourselves to data interchange for linear programs.

We would like to emphasize again that the idea is to build an efficient, easy-to-use, and easy-to-extend interface tool for modeling environments. MP-DIT can be enhanced and extended in several ways. Different data structures can be implemented for internal storage. The set of overloaded functions can be extended. New classes can be defined, for instance to support data interchange between a modeling language and a report generator.

There are several issues that, for the sake of brevity, are not dealt with in the following outline, but should be considered in real implementations in order to allow the use of the MP-DIT classes with different applications and in different programming environments. Three of them are mentioned below. First, the use of typedef types of variables to deal with different compiler and hardware characteristics; for instance to be able to handle large problems in (MS-DOS) environments where (for many compilers) the int type has a size of two bytes. Second, the use of replaceable functions for memory management; in simple implementations only the operators new and delete will be used. Third, the use of replaceable functions for messages generated by MP-DIT; in simple implementations message functions will write to stdout.

On a different level, there are issues related to the control structure of the environment in which MP-DIT is used, i.e., the mechanisms by which the flow of execution of the various components can be specified. In a sophisticated modeling environment, for instance, there have to be mechanisms to notify both MP-DIT and the solvers (maybe via MP-DIT) that several slightly modified instances have to be solved consecutively, with the data modified upon analysis of the solutions obtained. In such a situation, the information pertaining to the active instance should be kept for later use, even when a specific action is completed.
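As an illustration of the first and third issue, the sketch below shows one possible way to introduce a typedef for index values and a replaceable message function. All names (mpd_index, mpd_message_fn, mpd_set_message, mpd_report_size) are hypothetical and chosen for this example only; a real implementation may organize these hooks quite differently.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical index type: long keeps nonzero counts usable where int is
// only two bytes wide; an application could redefine it in one place.
typedef long mpd_index;

// Hypothetical message hook: any function with this signature can
// replace the default handler.
typedef void (*mpd_message_fn)(const char* text);

// Default handler writes to stdout, as in a simple implementation.
static void mpd_default_message(const char* text) {
    std::printf("%s\n", text);
}

static mpd_message_fn mpd_message = mpd_default_message;

// Install a replacement handler and return the previous one.
mpd_message_fn mpd_set_message(mpd_message_fn fn) {
    assert(fn != 0);
    mpd_message_fn old = mpd_message;
    mpd_message = fn;
    return old;
}

// Library code reports through the hook instead of calling printf directly.
void mpd_report_size(mpd_index nz) {
    std::string text = "nonzeros: " + std::to_string(nz);
    mpd_message(text.c_str());
}
```

An interactive modeling environment could install a handler that routes messages to a window or a log file, while a batch application keeps the default. Replaceable memory management functions could follow the same pattern.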

//
// Constant definitions
//

#define VERSION        "0.00"      /* MP-DIT version */
#define DATELENGTH     26          /* length of ASCII formatted date */
#define NAMELENGTH     8           /* length of column and row names */
#define PROBLEMLENGTH  20          /* length of problem name */

#define INF            (1.e+31)    /* infinity */
#define EPS            (1.e-6)     /* epsilon */
#define MAXITER        1000000     /* maximum number of iterations */
#define MAXTIME        3600        /* maximum number of cpu seconds */

//
// Macros
//

#define IERR \
    faterr("Internal error in \"%s\" at line %d", __FILE__, __LINE__)

//
// Enumeration types
//

enum TYPE    {MINIMIZATION, MAXIMIZATION};
enum ACCESS  {BYCOLS, BYROWS, BYTRIPLETS};
enum STORAGE {MEMORY, DISK};
enum STATUS  {SOLVED, UNSOLVED};

//
// Structures related to instances of linear programs
//

typedef struct {
    char     name[PROBLEMLENGTH+1];  /* problem name */
    int      n;                      /* number of columns */
    int      m;                      /* number of rows */
    long     nz;                     /* number of nonzeros */
    TYPE     type;                   /* minimization or maximization */
    ACCESS   access;                 /* access type */
    STORAGE  storage;                /* storage type */
    STATUS   status;                 /* status of instance */
} IHEADER;

typedef struct {
    double  *obj;                    /* cost row */
    double  *lbr;                    /* row lower bounds */
    double  *ubr;                    /* row upper bounds */
    char    *sense;                  /* row senses */
    char   **rname;                  /* row names */
    double  *lbc;                    /* column lower bounds */
    double  *ubc;                    /* column upper bounds */
    char   **cname;                  /* column names */
    int     *c;                      /*                               */
    int     *r;                      /* storage of coefficient matrix */
    double  *v;                      /*                               */
} IDATA;

typedef struct {
    //
    // Different data structure for the storage of the matrix,
    // i.e., maybe using super sparsity structures
    //
} IDATA2;

typedef struct {
    double   lbr;                    /* lower bound */
    double   ubr;                    /* upper bound */
    char     sense;                  /* sense */
    char    *rname;                  /* name */
    int      nz;                     /* number of nonzero coefficients */
    int     *c;                      /* column indices of nonzero coefficients */
    double  *v;                      /* values of nonzero coefficients */
} IROW;

typedef struct {
    double   obj;                    /* objective coefficient */
    double   lbr;                    /* lower bound */
    double   ubr;                    /* upper bound */
    char    *cname;                  /* name */
    int      nz;                     /* number of nonzero coefficients */
    int     *r;                      /* row indices of nonzero coefficients */
    double  *v;                      /* values of nonzero coefficients */
} ICOL;

typedef struct {
    short    sprec;                  /* short precision */
    short    iprec;                  /* integer precision */
    short    lprec;                  /* long precision */
    short    dprec;                  /* double precision */
    long     iter;                   /* maximum number of iterations */
    long     time;                   /* maximum execution time */
    double   zero;                   /* zero tolerance */
    double   feas;                   /* feasibility tolerance */
    double  *scaling;                /* scaling coefficients */
    double   deflbr;                 /* default lower bound for rows */
    double   defubr;                 /* default upper bound for rows */
    double   deflbc;                 /* default lower bound for columns */
    double   defubc;                 /* default upper bound for columns */
    //
    // This list may be extended by other commonly used
    // parameters and tolerances
    //
} INFO;

//
// Structures related to solutions to instances of linear programs
//

typedef struct {
    char     name[PROBLEMLENGTH+1];  /* problem name */
    int      n;                      /* number of columns */
    int      m;                      /* number of rows */
    long     nz;                     /* number of nonzeros */
    TYPE     type;                   /* minimization or maximization */
    STORAGE  storage;                /* storage type */
    STATUS   status;                 /* status of instance */
} SHEADER;

typedef struct {
    double   zlp;                    /* solution value */
    double  *xlp;                    /* solution vector */
    double  *rc;                     /* reduced cost vector */
    double  *pi;                     /* dual vector */
    double  *slack;                  /* slack vector */
    int     *rstat;                  /* row status vector */
    int     *cstat;                  /* column status vector */
} SDATA;

//
// Classes
//

#include <stdio.h>                   /* FILE */

class instance {
    char    *version;                /* MP-DIT version */
    char    *date;                   /* date */
    FILE    *fp_instance;            /* disk storage */
    IHEADER  iheader;                /* instance header */
    IDATA    idata;                  /* instance data */
    INFO     info;                   /* instance info */
public:
    instance (char *fname, STORAGE storage = DISK, ACCESS atype = BYCOLS);
    ~instance ();
    int put (IHEADER *iheader);
    int put (IDATA *idata);
    int put (INFO *info);
    int put (int ix, IROW *row);     /* put a single row */
    int put (int ix, ICOL *col);     /* put a single column */
    int get (IHEADER *iheader);
    int get (IDATA *idata);
    int get (int ix, IROW *row);     /* get a single row */
    int get (int ix, ICOL *col);     /* get a single column */
    int get (INFO *info);
    int read (char *fname);          /* read an instance in MPS format */
    int write (char *fname);         /* write an instance in MPS format */
};

class solution {
    char    *version;                /* MP-DIT version */
    char    *date;                   /* date */
    FILE    *fp_solution;            /* disk storage */
    SHEADER  sheader;                /* solution header */
    SDATA    sdata;                  /* solution data */
public:
    solution (char *fname, STORAGE storage = DISK);
    ~solution ();
    int put (SHEADER *sheader);
    int put (SDATA *sdata);
    int get (SHEADER *sheader);
    int get (SDATA *sdata);
    int read (char *fname);          /* read a solution in MPS format */
    int write (char *fname);         /* write a solution in MPS format */
};
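The read and write members in the outline are intended to handle MPS files. As a rough illustration of what a write function has to produce, the sketch below emits the NAME, ROWS, COLUMNS, RHS and ENDATA sections of an MPS input file for a small linear program. The types TinyRow, TinyEntry and TinyLP and the function write_mps are hypothetical and independent of the structures above; the sketch separates fields by blanks rather than enforcing the fixed column positions of the classical format, and it omits the RANGES and BOUNDS sections.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory LP used only for this sketch: rows carry a sense
// ('L', 'G' or 'E') and a right-hand side; the matrix is held as triplets.
struct TinyRow { std::string name; char sense; double rhs; };
struct TinyEntry { int row; int col; double val; };   // row == -1: objective

struct TinyLP {
    std::string name;
    std::vector<TinyRow> rows;
    std::vector<std::string> cols;
    std::vector<TinyEntry> entries;
};

// Write the MPS sections NAME, ROWS, COLUMNS, RHS and ENDATA.
void write_mps(std::ostream& os, const TinyLP& lp) {
    os << "NAME " << lp.name << "\n";
    os << "ROWS\n";
    os << " N COST\n";                                  // objective row
    for (std::size_t i = 0; i < lp.rows.size(); ++i)
        os << " " << lp.rows[i].sense << " " << lp.rows[i].name << "\n";
    os << "COLUMNS\n";
    for (std::size_t j = 0; j < lp.cols.size(); ++j) {  // MPS lists column by column
        for (std::size_t k = 0; k < lp.entries.size(); ++k) {
            const TinyEntry& e = lp.entries[k];
            if (e.col != static_cast<int>(j)) continue;
            assert(e.row >= -1 && e.row < static_cast<int>(lp.rows.size()));
            std::string rn = (e.row < 0) ? std::string("COST")
                                         : lp.rows[e.row].name;
            os << " " << lp.cols[j] << " " << rn << " " << e.val << "\n";
        }
    }
    os << "RHS\n";
    for (std::size_t i = 0; i < lp.rows.size(); ++i)
        os << " RHS " << lp.rows[i].name << " " << lp.rows[i].rhs << "\n";
    os << "ENDATA\n";
}
```

Even this small sketch shows why the format is costly for interchange: every coefficient is written as text together with its row and column names, and the complete file has to be rewritten when a single coefficient changes.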

4

Conclusion

In this note, we have presented some ideas on a data interchange tool for mathematical programming that can be used in sophisticated modeling environments to link data bases, modeling languages, report generators, and solvers. We hope this note will serve as a basis for further discussion among those interested in the subject. In order to facilitate such a discussion the Methodology of Decision Analysis Project at IIASA is tentatively planning to organize a small workshop on data interchange tools for mathematical programming in the summer of 1993. We would like to invite researchers and software developers, interested in either developing such tools or in using them as an interface for their software. Obviously, the workshop will be organized only if a sufficient number of people are interested. The workshop should provide a useful forum for discussions and experiments that can result in improving the suggested MP-DIT specification and its implementations. For more information and other comments or suggestions please contact one of the authors.

5

References

A. BROOKE, D. KENDRICK, A. MEERAUS (1988). GAMS: A User's Guide. Scientific Press, Redwood City, CA.

CPLEX OPTIMIZATION, INC. (1990). Using the CPLEX Linear Optimizer.

R. FOURER, D.M. GAY, B.W. KERNIGHAN (1990). A modeling language for mathematical programming. Management Science 36, 519-554.

A.M. GEOFFRION (1989). Computer-based modeling environments. Europ. J. Oper. Res. 41, 33-43.

H.J. GREENBERG (1990a). A primer for MODLER: Modeling by Object-Driven Linear Elemental Relations. University of Colorado at Denver, Denver, CO.

H.J. GREENBERG (1990b). A primer for ANALYZE: A Computer-Assisted Analysis System for Mathematical Programming Models and Solutions. University of Colorado at Denver, Denver, CO.

H.J. GREENBERG (1990c). A primer for RANDMOD: A System for Modifications to an Instance of a Linear Program. University of Colorado at Denver, Denver, CO.

INTERNATIONAL BUSINESS MACHINES CORP. (1991). Optimization Subroutine Library: Guide and Reference.

I.J. LUSTIG, R.E. MARSTEN, D.F. SHANNO (1989). Computational Experience with a Primal Dual Interior Point Method for Linear Programming. Technical Report SOR 89-17, Princeton University, Princeton, NJ.

I.J. LUSTIG, R.E. MARSTEN, D.F. SHANNO (1990). On implementing Mehrotra's Predictor-Corrector Interior Point Method for Linear Programming. Technical Report SOR 90-03, Princeton University, Princeton, NJ.

B. STROUSTRUP (1986). The C++ programming language. Addison Wesley, Reading, Massachusetts.

J.A. TOMLIN, J.S. WELCH JR. (1993). Mathematical Programming Systems. E. Coffman, J.K. Lenstra (eds.). Handbook of Operations Research and Management Science: Computation, North-Holland, Amsterdam, forthcoming.

J.S. WELCH JR. (1987). The data management needs of mathematical programming applications. IMA J. of Mathematics in Management 1, 237-250.
