
P2C

Section: User Commands (1)
Updated: local
 

NAME

p2c - Pascal to C translator, version 1.21alpha-07.Dec.93  

SYNOPSIS

p2c [ options ] [ file [ module ] ]  

DESCRIPTION

P2c is a tool for translating Pascal programs into C. The input consists of a set of source files in any of the following Pascal dialects: HP Pascal, Turbo/UCSD Pascal, DEC VAX Pascal, Oregon Software Pascal/2, Macintosh Programmer's Workshop Pascal, Sun/Berkeley Pascal, Texas Instruments Pascal, Apollo Domain Pascal. Modula-2 syntax is also supported. Output is a set of .c and .h files that comprise an equivalent program in any of several dialects of C. Output code may be kept machine- and dialect-independent, or it may be targeted to a specific machine and compiler. Most reasonable Pascal programs are converted into fully functional C which will compile and run with no further modifications, although p2c sometimes chooses to generate readable code at the expense of absolute generality. P2c endeavors to insert notes and warning messages into the output code to point out areas which may require human intervention. Output code is arranged to be readable and efficient, and to make use of C idioms wherever possible. The main goal of the translation is to produce C files which are pleasant and "natural" enough to be acceptable as the new source files for a program. In a pinch, p2c will also serve as an ad hoc Pascal compiler. The p2cc(1) script makes it easy to use p2c as a compiler.

Code generated by p2c normally does not assume characters are signed or unsigned. Also, it assumes int is the same as either short or long but does not depend on which. However, if int is not the same as long it is best to use a modern C compiler which supports prototypes. Generated code does not require an ANSI-compatible compiler (unless ANSI-style code is requested), but it does use various ANSI-standard library routines.

All generated code includes the file <p2c/p2c.h> which in turn includes <stdio.h> and various other common resources. Also, many translated programs will need to be linked with the run-time library, typically -lp2c.

Given a file name, p2c reads from the specified file and outputs to a file with a .c suffix added or substituted. For example,

p2c myfile.pas

reads from myfile.pas to produce the file myfile.c. The input file may contain a Pascal main program or a single Pascal module (or "unit" in Turbo and UCSD Pascal nomenclature), or it may just contain a number of procedures and declarations. P2c is designed to work on correct input programs: it will accept partial programs, but it may occasionally dump core if the input refers to undefined symbols.

If the input is a module, the translator will also produce a file module.h containing a translation of the module's interface section. The implementation section may be omitted in which case only the .h file will be interesting. If the program or module has include files, these may cause additional .c files to be generated depending on the value of the ExpandIncludes option (see below).

If no file name is given, p2c reads Pascal from the standard input and writes the resulting C to standard output (though a .h file may still be produced). If a file name and module name are given, the file may include several modules (or units). The specified module is translated; any others are skipped. The output files will be named module.c and module.h. P2c never translates more than one module per run.

Before starting, p2c reads the file --HOMEDIR--/p2crc for a number of configuration parameters. (The actual path used on your system may vary. The -i option is a handy way to examine this file.) If the P2CRC environment variable is set, it gives the name of a file to read instead of the system file; this file can start with Include %H/p2crc to include the system file. Next, p2c attempts to read the file p2crc in your directory for further configuration. If this file does not exist, p2c looks for .p2crc instead.  

OPTIONS

-o cfile
Use cfile in place of file.c or module.c as the primary output file. A single dash (`-o -') says to write the C code to the standard output.
-h hfile
Use hfile in place of module.h as the output file for interface text. This only has effect if the input is an HP Pascal module or a Turbo Pascal unit.
-s sfile
Read interface text from sfile before beginning the translation. This file typically contains one or more modules, often with implementation sections omitted for speed, which the program or module being translated will use. (Typically the ImportFrom and ImportDir parameters in p2crc are set up to allow p2c to locate interface text without needing any -s options.) If there are several -s options in the command, the sfiles are read from left to right.
-pn
Display progress of translation in the form of a line number/file name display. This is refreshed every n lines, 25 by default.
-c rcfile
Read local configuration commands from rcfile instead of p2crc or .p2crc. A dash (`-c -') in place of rcfile causes no local configuration file to be used.
-v
("Vanilla.") Do not read from the system configuration file --HOMEDIR--/p2crc. Since some of the parameters in this file are required, your local configuration file must include those parameters instead. This also suppresses the file named by the P2CRC environment variable.
-H homedir
Use homedir instead of --HOMEDIR-- as the p2c home directory. The system p2crc file will be searched for in this directory.
-Ipattern
Add pattern to the ImportDir search list of places to find modules which are imported. The pattern should include a %s to represent the module name, and should evaluate to a potential file name for that module's source code. For example, ../%s.pas looks for modulename.pas in the parent of the current directory.
-i
This special option (which must be the only argument on the command line if used) simply copies the system configuration file --HOMEDIR--/p2crc to the standard output in its entirety. (It may be used with -H, but -i is most useful precisely when you don't know the location of the home directory.)
-q
Quiet mode. Suppresses output of status messages during translation.
-En
Abort translation after n errors. If n is omitted it defaults to zero, which means unlimited errors are allowed. Use -E1 to make p2c halt after the first error.
-e
Echo the Pascal source into the output file, surrounded by #ifdefs. This is the same as the CopySource parameter in the p2crc file.
-a
Produce modern ANSI C. This is a convenient override for the AnsiC parameter in the p2crc file.
-L language
Select input language name, such as VAX or TURBO. This is a convenient override for the Language parameter.
-V
Verbose mode. This causes p2c to generate an additional ".log" file with further details of the translation, such as a list of warnings and notes including those which are suppressed in the regular output.
-comp
Compiler mode. This switch tells p2c to use various configuration defaults that are more suitable for use as a Pascal compiler rather than a translator. It is the same as specifying the following options in your p2crc file:
     ElimDeadCode 0
     AnalyzeFlow 0
     MaxLineBreakTies 0
     FoldConstants 1
     FoldStrConstants 1
     OffsetForLoops 0
     StaticLinks 1
     BitwiseMod 0
     BitwiseDiv 0
     AssumeBits 0
     AssumeSigns 0
     FormatStrings 1
     StructFiles 1
     FullStrWrite 1
The p2cc script specifies this option when it runs p2c to compile a Pascal program.
-local
Local settings. This switch uses various configuration defaults that are appropriate if the code generated by p2c is going to be compiled and run on the same machine that ran p2c itself.
-check
Enable all error checking. Normally, some error checks are off by default, as described in the comments in the system p2crc file.
-M0
Disable memory conservation. This prevents p2c from freeing various data structures after translating each function, in case this new conservation feature causes unforeseen problems.
-R
Regression testing mode. Formats notes and warning messages in a way that makes it easier to run diff(1) on the output of p2c.

P2c also understands a few debugging options which may occasionally be useful when tracking down translation problems. The -dn option sets the "debug level" to n, a small integer which is normally zero. Debugging output is written into the regular output file along with the C code; the higher your n, the more "wallpaper" you get. Also, -t prints debugging information at every Pascal token, -Bn enables line-breaker debugging, -Cn enables comment placement debugging, and -Fn enables flow-analysis debugging.  

CHOICE OF SOURCE LANGUAGE

The Language configuration parameter or -L command-line option tells p2c which Pascal dialect to expect in the input file. Any language features which do not overlap between dialects are supported all of the time. The Language parameter is consulted when a syntax or usage is detected that has different meanings in two different dialects, and also to determine default values for various other translation parameters as described below.

The following language words are supported by p2c. Names are case-insensitive.

HP
HP Pascal. This is the default language. All features of HP Standard Pascal, the Pascal Workstation version, are supported except as noted in BUGS below. Some features of MODCAL, HP's extended Pascal, are also supported. This is a superset of ISO standard Pascal, including conformant arrays and procedural parameters.
HP-UX
HP Pascal, HP-UX version. Almost identical to the "HP" dialect.
Turbo
Turbo Pascal 5.0 for the IBM PC. Few conflicts with HP Pascal, so the Language parameter is not often needed for Turbo. (Most important is that the Turbo and HP dialects use 16 and 32 bit integers, respectively.)
UCSD
UCSD Pascal. Similar to Turbo in many ways.
MPW
Macintosh Programmer's Workshop Pascal 2.0. Should also do a pretty good job for Lightspeed Pascal. Object Pascal features are not supported, nor is the fact that char variables are sometimes stored in 16 bits.
VAX
VAX/VMS Pascal version 3.5. Most but not all language features supported. This has not yet been tested on large programs.
Oregon
Oregon Software Pascal/2. All features implemented.
Berk
Berkeley Pascal with Sun extensions.
TIP
Texas Instruments Pascal.
Apollo
Apollo Domain Pascal.
Modula
Modula-2. Based on Wirth's Programming in Modula-2, 3rd edition. Proper setting of the Language parameter is not optional. Translation will be incomplete in most cases, but should be good enough to work with. Structure of local sub-modules is essentially ignored; like-named identifiers may be confused. Type WORD is translated as an integer, but type ADDRESS is translated as char * or void *; this may cause inconsistencies in the output code.
Modula-2 modules have two parts in separate files. Suppose these are called foo.def (definition part) and foo.mod (implementation part) for module foo. Then a pattern like %s.def must be included in the ImportDir list, and LibraryFile must be changed to refer to system.m2 instead of system.imp. Give the command
     p2c foo.def
to translate the definition part into files foo.h and foo.c; the latter will usually be empty. The command
     p2c -s foo.def foo.mod
will translate the implementation part into file foo.c.

Even if all language features are supported for a dialect, some predefined functions may be omitted. In these cases, the function call will be translated literally into C with a warning. Some hand modification may be required.  

CONFIGURATION PARAMETERS

P2c is highly configurable. The defaults are suitable for most applications, but customizing these parameters will help you get the best possible translation. Since the output of p2c is intended to be used as human-maintainable source code, there are many parameters for describing the coding style and conventions you prefer. Others give hints about your program that help p2c to generate more correct, efficient, or readable code.

The p2crc files contain a list of parameters, one per line. The system configuration file, which may be viewed using the -i option to p2c, serves as an example of the proper format. Parameter names are case-insensitive. If a parameter name occurs exactly once in the system p2crc, this indicates that it must have a unique value and the last value given to it by the configuration files is used. Other parameters are written several times in a row; these are lists to which each configuration line adds an entry.

Many p2crc options take a numeric value of 0 or 1, roughly corresponding to "no" or "yes." Sometimes a blank value or the value "def" corresponds to an intermediate "maybe" state. For example, the stylistic option ExtraParens switches between copious or minimal parentheses in expressions, with the default being a nice compromise intended to be best for readers with an average knowledge of C operator precedences.

Configuration options may also be embedded in the source file in the form of Pascal comments:

     {ShortOpt=0} {AvoidName=fred}
     {FuncMacro slope(x,y)=atan2(y,x)*RadDeg}

disables automatic short-circuiting of `and' and `or' expressions, adds "fred" to the list of names to avoid using in generated C code, and defines a special translation for the Pascal program's slope function using the standard C atan2 function and a constant RadDeg presumably defined in the program. Whitespace is generally not allowed in embedded parameters. The `=' sign is required for embedded parameters, though it is optional in p2crc files. Comments within embedded parameters are delimited by `##'. Numeric parameters may replace `=' with `+' or `-' to increase or decrease the parameter; list-based parameters may use `-' to remove a name from a list rather than adding it. Also, the parameter name by itself in comment braces means to restore the parameter's value that was current before the last change:

     {VarFiles=0 ## Pass FILE *'s params by value even if VAR}
     some declarations
     {VarFiles ## Back to original FILE * passing}

causes the parameter VarFiles to have the value 0 for those few declarations, without affecting the parameter's value elsewhere in the file.

If an embedded parameter appears in an include file or in interface text for a module, the effect of the assignment normally carries over to any programs that included that file. If the parameter name is preceded by a `*', then the assignment is automatically undone after the source file that contains it ends:

     {IncludeFrom strings=<p2c/strings.h>}
     {*ExportSymbol=pascal_%s}
     module strings;

will record the location of the strings module's include file for the rest of the translation, but the assignment of ExportSymbol pertains only to the module itself.

For the complete list of p2crc parameters, run p2c with the -i option. Here are some additional comments on selected parameters:

ImportAll
Because Turbo Pascal only allows one unit per source file, p2c normally stops reading past the word implementation in a file being scanned for interface text. But HP Pascal allows several modules per file and so this would not be safe to do. The ImportAll option lets you override the default behavior for your Pascal dialect.
AnsiC
This parameter selects which dialect of C to use. If 1, all conventions of ANSI C such as prototypes, void * pointers, etc. are used. If 0, only strict K&R (first edition) C is used. The default is to use "traditional UNIX C," which includes enum and void but not void * or prototypes. Once again there are a number of other parameters which may be used to control the individual features if just setting AnsiC is not enough.
C++
This tells p2c to use a number of language extensions present in C++: Specifically, it enables the "//" format for comments, use of "anonymous unions" for variant records, use of declarations within the function body, use of references for VAR parameters, and use of "new" and "delete" instead of "malloc" and "free". P2c will check for collisions with C++ reserved words unless you explicitly set the C++ option to zero.
TurboObjects
P2c recognizes two major dialects of object-oriented Pascal. Turbo Pascal 6.0 object types translate fairly directly into C++ classes. In Apple's Object Pascal, the object type has similar syntax but represents a handle (a double pointer) to an object rather than an object itself. The TurboObjects option (whose default is determined by the Language setting) says whether objects should be direct or indirect through pointers. (P2c uses pointers instead of handles; p2c is most often used to make programs more portable, and few systems except the Mac use handles in this way.)
UseVExtern
Many non-UNIX linkers prohibit variables from being defined (not declared) by more than one source file. One module must declare, e.g., "int foo;", and all others must declare "extern int foo;". P2c accomplishes this by declaring public variables "vextern" in header files, and arranging for the macro vextern to expand to extern or to nothing when appropriate. If you set UseVExtern=0 p2c will instead declare variables in a simpler way that works only on UNIX-style linkers.
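The vextern arrangement can be sketched in C roughly as follows. This is an illustrative sketch only, not text generated by p2c: the guard macro FOO_DEFINE_VARS, the variable foo_count and the function bump_foo are hypothetical names, and the header and the owning source file are collapsed into one file so the example stands alone.

```c
/* Hypothetical sketch. In a real project the first part would sit in
   foo.h, and FOO_DEFINE_VARS (an assumed name) would be set only by
   the single source file that owns the variables. */
#define FOO_DEFINE_VARS 1   /* this file plays the owning module */

#ifdef FOO_DEFINE_VARS
# define vextern            /* expands to nothing: "int foo_count;" */
#else
# define vextern extern     /* elsewhere: "extern int foo_count;" */
#endif

vextern int foo_count;

#undef vextern

/* Any file including the header can use the variable. */
int bump_foo(void)
{
    return ++foo_count;
}
```

Every other source file would include the same header without the guard macro set, and so would see only the extern declaration.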
UseAnyptrMacros
Certain C reserved words have meanings which may vary from one C implementation to another. P2c uses special capitalized names for these words; these names are defined as macros in the file p2c.h which all translated programs include. You can set UseAnyptrMacros=0 to disable the use of these macros. Note that the functions of many of these macros can also be had directly using other parameters; for example, UseConsts allows you to specify whether your target language recognizes the word const in constant declarations. The default is to use the Const macro instead, so that your code will be portable to either kind of implementation.
Signed expands to the reserved word signed if that word is available, otherwise it is given a null definition. Similarly, Const expands to const if that feature is available. The words Volatile and Register are also defined in p2c.h, although p2c does not use them at present. The word Char expands to char by default, but might need to be redefined to signed char or unsigned char in a particular implementation. This is used for the Pascal character type; lowercase char is used when the desired meaning is "byte," not "character."
The word Static always expands to static by default. This is used in situations where a function or variable is declared static to make it local to the source file; lowercase static is used for static local variables. Thus you can redefine Static to be null if you want to force private names to be public for purposes of debugging.
The word Void expands to void in all cases; it is used when declaring a function with no return value. The word Anyptr is a typedef for void * or char * as necessary; it represents a generic pointer.
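As a rough illustration of how these capitalized words behave, the sketch below defines stand-ins for them and uses them in a few declarations. The definitions are assumptions made for the example (the real p2c.h may select them differently for each compiler), and greeting, delta, first_char, delta_value and store_int are hypothetical names.

```c
/* Hypothetical stand-ins for the p2c.h macros; the real header may
   choose these differently for each compiler. */
#if defined(__STDC__) || defined(__cplusplus)
# define Signed signed
# define Const  const
typedef void *Anyptr;
#else
# define Signed
# define Const
typedef char *Anyptr;
#endif
#define Char   char
#define Static static
#define Void   void

/* File-local data; redefining Static as empty would make it public. */
Static Const Char greeting[] = "hello";
Static Signed Char delta = -1;

/* Return the first character of the file-local greeting. */
Char first_char(Void) { return greeting[0]; }

/* Return the signed byte as an int. */
int delta_value(Void) { return delta; }

/* Store a value through a generic pointer. */
Void store_int(Anyptr where, int value) { *(int *)where = value; }
```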
UsePPMacros
The p2c.h header also declares two macros for function prototyping, PP(x) and PV(). These macros are used as follows:
     Void foo PP( (int x, int y, Char *z) );
     Char *bar PV( );
If prototypes are available, these macros will expand to
     Void foo (int x, int y, Char *z);
     Char *bar (void);
but if only old-style declarations are supported, you instead get
     Void foo ();
     Char *bar ();
By default, p2c uses these macros for all function declarations, but function definitions are written in old-style C. The UsePPMacros parameter can be set to 0 to disable all use of PP and PV, or it can be set to 1 to use the macros even when defining a function. (This is accomplished by preceding each old-style definition with a PP-style declaration.) If you know your code will always be compiled on systems that support prototyping, it is prettier to set Prototypes=1 or simply AnsiC=1 to get true function prototypes.
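One plausible way to define such macros is sketched below. The definitions and the __STDC__ test are assumptions made for illustration and may not match the real p2c.h; add and answer are hypothetical functions, and plain int is used in place of the Void and Char macros to keep the example self-contained.

```c
/* Hypothetical definitions; the real p2c.h may differ. */
#if defined(__STDC__) || defined(__cplusplus)
# define PP(x) x        /* keep the parameter list */
# define PV()  (void)   /* explicit "no parameters" */
#else
# define PP(x) ()       /* old-style: drop the parameter list */
# define PV()  ()
#endif

/* Declarations written once, usable with either kind of compiler. */
int add PP( (int x, int y) );
int answer PV( );

int add(int x, int y) { return x + y; }
int answer(void) { return 42; }
```

The doubled parentheses in the PP call are what allow a whole parameter list, commas included, to travel as a single macro argument.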
EatNotes
Notes and warning messages containing any of these strings as sub-strings are not emitted. Each type of message includes an identifier like [145]; you can add this identifier to the EatNotes list to suppress that message. Another useful form is to use a variable name or other identifier to suppress warnings about that variable. The strings are a space-separated list, and thus may not contain embedded spaces. To suppress notes around a section of code, use, e.g., {EatNotes+[145]} and {EatNotes-[145]}. Most notes are generated during parsing, but to suppress those generated during output the string may need to remain in the list far beyond the point where it appears to be generated. Use the string "1" or "0" to disable or enable all notes, respectively.
ExpandIncludes
The default action is to expand Pascal include files in-line. This may not be desirable if include files are being used to simulate modules. With ExpandIncludes=0, p2c attempts to convert include files containing only whole procedures and global declarations into analogous C include files. This may not always work, though; if you get error messages, don't use this option. By combining this option with StaticFunctions=0, then doing some fairly minor editing on the result, you can convert a pseudo-modular Pascal program into a truly modular collection of C source files.
ElimDeadCode
Some transformations that p2c does on the program may result in unreachable or "dead" code. By default p2c removes such code, but sometimes it removes more than it should. If you have "if false" segments which you wish to retain in C, you may have to set ElimDeadCode=0.
AnalyzeFlow
By default p2c does some basic dataflow analysis on the program in an attempt to locate code that can be simplified due to knowledge about the possible values of certain variables. For example, a Pascal rewrite statement must translate to an if that either calls fopen on a formerly closed file variable, or freopen on an already-open file. If flow analysis can prove that the file was open or closed upon entry to the statement, a much cleaner translation is possible.
It is possible that flow analysis will make simplifications that are undesirable or buggy. If this occurs, you can set AnalyzeFlow to 0 to disable this feature.
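The rewrite example can be sketched in C as follows. This is a hand-written illustration under the assumption that a closed file variable is represented by NULL; it is not actual p2c output, and rewrite_general and rewrite_known_closed are hypothetical names.

```c
#include <stdio.h>

/* General case: nothing is known about f, so both paths are needed. */
FILE *rewrite_general(FILE *f, const char *name)
{
    if (f != NULL)
        f = freopen(name, "w", f);   /* already open: reuse the stream */
    else
        f = fopen(name, "w");        /* closed: open a new stream */
    return f;
}

/* If analysis can prove f is closed on entry, one call is enough. */
FILE *rewrite_known_closed(const char *name)
{
    return fopen(name, "w");
}
```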
SkipIndices
Normally Pascal arrays not based at zero are "shifted" down for C, preserving the total size of the array. A Pascal array a[2..10] is translated to a C array a[9] with references like "a[i]" changed to "a[i-2]" everywhere. If SkipIndices is set to a value of 2 or higher, this array would instead be translated to a[11] with the first two elements never used. This arrangement may generate incorrect code, though, for tricky source programs.
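The two layouts for a[2..10] can be compared in a small C sketch. It is illustrative rather than actual p2c output; the array and function names are hypothetical, and the helper functions stand in for the Pascal reference a[i].

```c
#include <stddef.h>

/* Pascal: var a : array [2..10] of integer; */

/* Default translation: shifted down to base zero, so the array keeps
   its 9 elements and every index is offset by 2. */
int a_shifted[9];

/* With SkipIndices >= 2: padded to 11 elements and indexed directly;
   elements 0 and 1 are never used. */
int a_skipped[11];

/* Hypothetical helpers standing in for the Pascal reference a[i]. */
int *shifted_ref(int i) { return &a_shifted[i - 2]; }
int *skipped_ref(int i) { return &a_skipped[i]; }

/* Element counts of the two C arrays. */
size_t shifted_count(void) { return sizeof a_shifted / sizeof a_shifted[0]; }
size_t skipped_count(void) { return sizeof a_skipped / sizeof a_skipped[0]; }
```

The skipped form trades two unused elements for index expressions that match the Pascal source.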
FoldConstants
Pascal non-structured constants generally translate to #define's in C. Set this to 1 to have constants instantiated directly into the code. This may be turned on or off around specific constant declarations. Set this to 0 to force p2c to make absolutely no assumptions about the constant's value in generated code, so that you can change the constant later in the C code without invalidating the translation. The default is to allow p2c to take advantage of its knowledge of a constant's value, such as by generating code that assumes the constant is positive.
CharConsts
This governs whether single-character string literals in Pascal const declarations should be interpreted as characters or strings. In other words, const a='x'; will translate to #define a 'x' if CharConsts=1 (the default), or to #define a "x" if CharConsts=0. Note that if p2c guesses wrong, the generated code will not be wrong, just uglier. For example, if a is written as a character constant but it turns out to be used as a string, p2c will have to write char-to-string conversion code each time the constant is used.
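The cost of a wrong guess can be seen in a short C sketch. The macro and function names are hypothetical, and the temporary-buffer conversion is only one way such code could look, not necessarily what p2c emits.

```c
#include <string.h>

#define A_CHAR 'x'   /* const a='x' treated as a character */
#define A_STR  "x"   /* the same constant treated as a string */

/* Used as a string: the string form can be passed directly. */
size_t length_from_string(void) { return strlen(A_STR); }

/* Used as a string: the character form needs a temporary buffer first. */
size_t length_from_char(void)
{
    char tmp[2];
    tmp[0] = A_CHAR;
    tmp[1] = '\0';
    return strlen(tmp);
}
```

Both functions give the same answer; the second is simply longer, which is the "uglier" outcome described above.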
PreserveTypes
P2c makes an attempt to retain the original names used for data types. For example,
     type foo = integer; bar = integer;
establishes two synonyms for the standard integer type; p2c does its best to preserve the particular synonym that was used to declare each integer variable. Because the Pascal language treats these types as indistinguishable, there will be cases in the translation where p2c must fall back on the "true" type, int. PreserveTypes and a few related options control whether various kinds of type names are preserved. The default settings preserve all type names except for pointer types, which use "*" notation throughout the program. This reflects the fact that Pascal forces pointer types to be named when traditionally they are not separately named in C.
VarStrings
In HP Pascal, a parameter of the form "var s : string" will match a string variable of any size; a hidden size parameter is passed which may be accessed by the Pascal strmax function. You can prevent p2c from creating a hidden size parameter by setting VarStrings=0. (Note that each function uses the value of VarStrings as of the first declaration of the function that is parsed, which is often in the interface section of a module.)
Prototypes
Control whether ANSI C function prototypes are used. Default is according to AnsiC or C++. This also controls whether to include parameter names or just their types in situations where names are optional. The FullPrototyping parameter allows prototypes to be generated for declarations but not for definitions (older versions of Lightspeed C required this). If you use a mixture of prototypes and old-style definitions, types like short and float will be promoted to int and double as required by the ANSI standard, unless PromoteArgs is used to override this. The CastArgs parameter controls whether type-casts are used in function arguments; by default they are used only if prototypes are not available.
StaticLinks
HP Pascal and Turbo Pascal each include the concept of procedure or function pointers, though with somewhat different syntaxes. P2c recognizes both notational styles. Another difference is that HP's procedure pointers can point to nested procedures, while Turbo's can point only to global procedures. In HP Pascal a procedure pointer must be stored as a struct containing both a pure C function pointer and a "static link," a pointer to the parent procedure's locals. (The static link is NULL for global procedures.) This notation can be forced by setting StaticLinks=1. In Turbo, the default (StaticLinks=0) is to use plain C function pointers with no static links. A third option (StaticLinks=2) uses structures with static links, but assumes the links are always NULL when calling through a pointer (if you need compatibility with the HP format but know your procedures are global).
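A procedure pointer with a static link might look roughly like the C sketch below. The struct layout, the calling convention (static link passed as an extra last argument) and all names are assumptions made for illustration; the code p2c actually generates may differ.

```c
#include <stddef.h>

/* Hypothetical HP-style procedure pointer. */
typedef struct {
    void (*proc)(void);  /* generic function pointer, cast before calling */
    void *link;          /* parent's locals, or NULL for a global procedure */
} proc_ptr;

/* Locals of an imagined parent procedure. */
struct outer_locals { int total; };

/* "Nested" procedure: reaches the parent's locals via the static link. */
int nested_add(int x, void *link)
{
    struct outer_locals *outer = (struct outer_locals *)link;
    outer->total += x;
    return outer->total;
}

/* Global procedure: takes no static link. */
int global_double(int x) { return 2 * x; }

/* Call through the pointer, passing the link only when there is one. */
int call_proc(proc_ptr p, int x)
{
    if (p.link != NULL)
        return ((int (*)(int, void *))p.proc)(x, p.link);
    return ((int (*)(int))p.proc)(x);
}
```

Under this sketch, the StaticLinks=2 variant would correspond to keeping the struct but always taking the second branch of call_proc.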
SmallSetConst
Pascal sets are translated into one of two formats, depending on the size of the set. If all elements have ordinal values in the range 0..31, the set is translated as a single integer variable using bit operations. (The SetBits parameter may be used to change the upper limit of 31.) The SmallSetConst parameter controls whether these small-sets are used, and, if so, how constant sets should be represented in C. For larger sets, an array of long is used. The s[0] element contains the number of succeeding array elements which are in use. Set elements in the range 0..31 are stored in the s[1] array element, and so on. Sets are normalized so that s[s[0]] is nonzero for any nonempty set. The standard run-time library includes all the necessary procedures for operating on sets.
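The two set layouts can be sketched in C as follows. The functions small_in, large_in and large_add are hypothetical and are not the routines of the p2c run-time library; the sketch assumes 32 set elements per array word, matching the 0..31 range described above.

```c
#define SETBITS 32  /* assumed elements per word, matching the 0..31 range */

/* Small set: a single integer used as a bit mask. */
int small_in(unsigned long set, unsigned elem)
{
    return elem < SETBITS && ((set >> elem) & 1UL) != 0;
}

/* Large set: s[0] holds the count of words in use; element e lives in
   bit (e % SETBITS) of s[e / SETBITS + 1]. */
int large_in(const long *s, unsigned elem)
{
    unsigned word = elem / SETBITS + 1;
    if ((long)word > s[0])
        return 0;                      /* beyond the words in use */
    return (int)(((unsigned long)s[word] >> (elem % SETBITS)) & 1UL);
}

/* Add an element, extending s[0] as needed. The caller must supply an
   array with enough room. The last word in use ends up nonzero, so the
   set stays normalized. */
void large_add(long *s, unsigned elem)
{
    unsigned word = elem / SETBITS + 1;
    while (s[0] < (long)word) {
        s[0]++;
        s[s[0]] = 0;
    }
    s[word] |= (long)(1UL << (elem % SETBITS));
}
```

Element 40, for instance, lands in bit 8 of s[2], and an empty set is simply one whose s[0] is zero.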
ReturnValueName
This is one of many "naming conventions" parameters. Most of these take the form of a printf-like string containing a %s where the relevant information should go. In the case of ReturnValueName, the %s refers to a function name and the resulting string gives the name of the variable to use to hold the function's return value. Such a variable will be made if a function contains assignments to its return value buried within the body, so that return statements cannot conveniently be used. Some parameters (ReturnValueName included) do not require the %s to be present in the format string; for example, the standard p2crc file stores every function's return value in a variable called Result.
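As an illustration, a Pascal function that assigns its return value in several places could be rendered with a Result variable along these lines. The translation is hand-written for this example and is not actual p2c output; the function clamp is hypothetical.

```c
/* Pascal:
     function clamp(x : integer) : integer;
     begin
       clamp := x;
       if x < 0 then clamp := 0;
       if x > 100 then clamp := 100
     end;
*/
int clamp(int x)
{
    int Result;          /* holds the value the Pascal code assigns to clamp */

    Result = x;
    if (x < 0)
        Result = 0;
    if (x > 100)
        Result = 100;
    return Result;       /* single return at the end of the body */
}
```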
AlternateName
P2c normally translates Pascal names into C names verbatim, but occasionally this is not possible. A Pascal name may be a C reserved word or traditional C name like putc, or there may be several like-named things that are hidden from each other by Pascal's scoping rules but must be global in C. In these situations p2c uses the parameter AlternateName1 to generate an alternative name for the symbol. The default is to add an underscore to the name. There is also an AlternateName2 parameter for a second alternate name, and an AlternateName parameter for the nth alternate name. (The value for this parameter should include both a %s and a %d, in either order.) If these latter parameters are not defined, p2c applies AlternateName1 many times over.
ExportSymbol
Symbols in the interface section for a Pascal module are formatted according to the value of ExportSymbol, if any. It is not uncommon to use modulename_%s for this symbol; the default is %s, i.e., no special treatment for exported symbols. If you also define the Export_Symbol parameter, that format is used instead for exported symbols which contain an underscore character. If %S (with a capital "S") appears in the format string it stands for the current module name.
Alias
If the value of this parameter contains a %s, it is a format string applied to the names of external functions or variables. If the value does not contain a %s, it becomes the name of the next external symbol which is declared (after which the parameter is cleared).
Synonym
This creates a synonym for another Pascal symbol or keyword. The format is
     Synonym old-name = new-name
All occurrences of old-name in the input text are treated as if they were new-name by the parser. If new-name is a keyword, old-name will be an equivalent keyword. If new-name is the name of a predefined function, old-name will behave in the same way as that function, and so on. If new-name is omitted, then occurrences of old-name are entirely ignored in the input file. Synonyms allow you to skip over a keyword in your dialect of Pascal that is not understood by p2c, or to simulate a keyword or predefined identifier of your dialect with a similar one that p2c recognizes. Note that all predefined functions are available at all times; if you have a library routine that behaves like, e.g., Turbo Pascal's getmem procedure, you can make your routine a synonym for getmem even if you are not translating in Turbo mode.
NameOf
This defines the name to use in C for a specific symbol. It must appear before the symbol is declared in the Pascal code; it is usually placed in the local p2crc file for the project. The format is
     NameOf pascal-name = C-name
By default, Pascal names map directly onto C names with no change (except for the various kinds of formatting outlined above). If the pascal-name is of the form module.name or procedure.name then the command applies only to the instance of the Pascal name that is global to that module, or local to that procedure. Otherwise, it applies to all usages of the name.
VarMacro
This is analogous to NameOf, but specifically for use with Pascal variables. The righthand side can be almost any C expression; all references to the variable are expanded into that C expression. Names used in the C expression are taken verbatim. There is also a ConstMacro parameter for translating constants as arbitrary expressions. Note that the variable on the lefthand side must actually be declared in the program or in a module that it uses. The declaration for the variable will be omitted from the generated code unless the Pascal-name appears in the expression: If you ask to replace i with i+1, the variable i will still be declared but its value will be shifted accordingly. Note that if i appears on the lefthand side of an assignment, p2c will use algebra to "solve" for i.
In all cases where p2c parses C expressions, all C operators are recognized except compound assignments like `+='. (Increment and decrement operators are allowed.) All variable and function names are assumed to have integer type, even if they are names that occur in the actual program. A type-specification operator `::' has been introduced; it has the same precedence as `.' or `->', but the righthand side must be a Pascal type identifier (built-in, or defined by your program before the macro definition is parsed) or an arbitrary Pascal type expression in parentheses. The lefthand argument is then considered to have the specified type. This may be necessary if your macro is used in situations where the exact type of the expression must be known (say, as the argument to a writeln).
FieldMacro
Here the lefthand side must have the form record.field, where record is the Pascal type or variable name for a record, and field is a field in that record. The righthand side must be a C expression generally including the name record. All instances of that name are replaced by the actual record being "dotted." For example,
     FieldMacro Rect.topLeft = topLeft(Rect)
translates a[i].topLeft into topLeft(a[i]), where a is an array of Rect.
FuncMacro
The lefthand side must be any Pascal function or procedure name plus a parameter list. The number of parameters must match the number in the function's uses and declaration. Calls to the function are replaced by the C expression on the righthand side. For example,
     FuncMacro PtInRect(p,r) = PtInRect(p,&r)
causes the second argument of PtInRect to be passed by reference, even though the declaration says it's not. If the function in question is actually defined in the program or module being translated, the FuncMacro will not affect the definition but it will affect all calls to the function elsewhere in the module. FuncMacros can also be applied to predefined or never-defined functions.
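As a rough illustration of what the PtInRect macro above does to a call site, here is a minimal C sketch. The Point and Rect layouts, the body of PtInRect, and the wrapper is_inside are all assumptions made up for this example; only the call form PtInRect(p,&r) comes from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record layouts standing in for the program's own types. */
typedef struct { int h, v; } Point;
typedef struct { int top, left, bottom, right; } Rect;

/* Hypothetical C-side routine that wants the rectangle by address. */
int PtInRect(Point p, const Rect *r)
{
    assert(r != NULL);
    return p.h >= r->left && p.h < r->right &&
           p.v >= r->top  && p.v < r->bottom;
}

/* Pascal:  inside := PtInRect(p, r);
   With the FuncMacro in effect, the translated call would read roughly
   like the return statement below: the second argument gains an '&'. */
int is_inside(Point p, Rect r)
{
    return PtInRect(p, &r);
}
```

The macro only rewrites calls; a definition of PtInRect inside the translated module would still be emitted according to its Pascal declaration.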
ReplaceBefore
This option specifies a string replacement to be done on every Pascal source line. For example:
     ReplaceBefore "{$ifdef" "{EMBED #ifdef"
     ReplaceBefore "{$endif}" "{EMBED #endif}"
These lines rewrite Turbo Pascal compile-time conditionals into comments beginning with the special word EMBED. This word instructs p2c to format the rest of the comment without "/* */" delimiters, i.e., the rest of the comment is embedded directly in the output C program. There is also a ReplaceAfter option, which specifies replacements to be done on the output of p2c.
Currently, this feature makes only literal string replacements, not pattern-based matches. Some users of p2c have found it useful to feed their Pascal programs through a more powerful editor like sed or perl before giving them to p2c. Quite often this is all that is necessary to get an acceptable translation in the face of unrecognized Pascal dialects or language features.
IncludeFrom
This specifies that a given module's header should be included from a given place. The second argument may be surrounded by " " or < > as necessary; if the second argument is omitted, no include directive will be generated for the module.
ImportFrom
This specifies that a given module's Pascal interface text can be found in the given file. The named file should be either the source file for the module, or a specially prepared file with the implementation section removed for speed. If no ImportFrom entry is found for a module, the path defined by the ImportDir list is searched. Each entry in the path may contain a %s, which expands to the name of the module. The default path looks for %s.pas and %s.text in the current directory, then for --HOMEDIR--/%s.imp (where --HOMEDIR-- is the p2c home directory).
StructFunction
This parameter is a list of functions which follow the p2c semantics for structure-valued functions (functions returning arrays, sets, and strings, and structs in primitive C dialects). For these functions, a pointer to a return-value area is passed to the function as a special first parameter. The function stores the result in this area, then returns a copy of the pointer. (The standard C function strcpy is an example of this concept. Sprintf also behaves this way in some dialects; it always appears on the StructFunction list regardless of the type of implementation.) The system configuration file includes a list of common structured functions so that p2c's optimizer will know how to manipulate them.
StrlapFunction
Functions on this list are structured functions as above, but with the ability to work in-place; that is, the same pointer may be passed as both the return value area and a regular parameter.
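The calling convention described for StructFunction and StrlapFunction can be sketched in C as follows. The function str_upper is a hypothetical example written for this page, not part of the p2c run-time library; it takes the return-value area as its first parameter and hands the same pointer back, in the manner of strcpy.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical structure-valued function: the caller supplies the
   return-value area as the first parameter, and the function returns
   a copy of that pointer. The area must be large enough for s. */
char *str_upper(char *result, const char *s)
{
    size_t i;

    assert(result != NULL && s != NULL);
    /* Each character is read before the same index is written, so the
       function also works in place (result == s). */
    for (i = 0; s[i] != '\0'; i++)
        result[i] = (char)toupper((unsigned char)s[i]);
    result[i] = '\0';
    return result;
}
```

Because str_upper tolerates result == s, a call such as str_upper(name, name) is safe; that is the property a StrlapFunction entry would declare, whereas a function that could not work in place would belong only on the StructFunction list.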
Deterministic
Functions on this list have no side effects or side dependencies. An example is the sin function in the standard math library; two calls with the same parameter values produce the same result, and have no effects other than returning a value. P2c can make use of this knowledge when optimizing code for efficiency or readability. Functions on this list are also assumed to be relatively fast, so that it is acceptable to duplicate a call to the function.
LeaveAlone
Functions on this list are not subjected to the normal built-in translation rules that p2c would otherwise use. For example, adding writeln to this list would translate writeln statements blindly into calls to a C writeln() function, rather than being translated into equivalent printf calls. The built-in translation is also suppressed if the function has a FuncMacro.
BufferedFile
P2c normally assumes binary files will use read/write, not get/put/^ notation. A file buffer variable will only be created for a file if buffer notation is used for it. For global file variables this may be detected too late (a declaration without buffers may already have been written). Such files can be listed in BufferedFile to force p2c to allocate buffers for them; do this if you get a warning message that says it is necessary. Set BufferedFile=1 to buffer all files, in which case UnBufferedFile allows you to force certain files not to have buffers.
StructFiles
If p2c still can't translate your file operations correctly, you can set StructFiles=1 to cause Pascal files to translate into structs which include the usual C FILE pointer, as well as file buffer and file name fields. While the resulting code doesn't look as much like native C, the file structs will allow p2c to do a correct translation in many more cases.
CheckFileEOF
Normally only file-open operations are checked for errors. Additional error checking, such as read-past-end-of-file, can be enabled with parameters like CheckFileEOF. These checks can make the code very ugly! If I/O checking is enabled by the program ($iocheck on$ in HP Pascal; {$I+} in Turbo; this is always the default state), these checks will generate fatal errors unless enclosed in an HP Pascal try-recover construct. If I/O checking is disabled, these will cause the global variable P_ioresult to be set to zero or nonzero according to the outcome. The default for most of these options is to check only when I/O checking is enabled.
 

ISSUES

Integer size. P2c normally generates code to work with either 16 or 32 bit ints. If you know your C integers will be 16 or 32 bits, set IntSize appropriately. In particular setting IntSize=32 will generate much cleaner code: p2c no longer must carefully cast function arguments between int and long. These casts also will be unnecessary if ANSI prototypes are available. To disable int/long casting because you know at least one of these cases will hold, set CastLongArgs=0. (The CastArgs parameter similarly controls other types of casts, such as between ints and doubles.) The Integer16 parameter controls whether Pascal integers are interpreted as 16 or 32 bits, or translated as native C integers. The default value depends on the Language selected.

Signed/unsigned chars. Pascal characters are normally "weakly" interpreted as unsigned; this is controlled by UnsignedChar. The default is "either," so that C's native char type may be used even if its signed-ness is unknown. Code that uses characters outside of the range 0-127 may need a different setting. Alternatively, you can use the types {SIGNED} char and {UNSIGNED} char in the few cases where it really matters. These comments are controlled by the SignedComment and UnsignedComment parameters. (The type {UNSIGNED} integer is also recognized.) The SignedChar parameter tells whether C characters are signed or unsigned (default is "unknown"). The HasSignedChar parameter tells whether the phrase "signed char" is legal in the output. If it is not, p2c may have to translate Pascal signed bytes into C shorts.

Special types. P2c understands the following predefined Pascal type names:

     integer                      signed integers depending on Integer16
     longint                      signed 32-bit integers
     unsigned                     unsigned 32-bit integers
     sword                        signed 16-bit integers
     word                         unsigned 16-bit integers
     c_int                        signed native C integers
     c_uint                       unsigned native C integers
     sbyte                        signed 8-bit integers
     byte                         unsigned 8-bit integers
     real                         floating-point numbers depending on DoubleReals
     single                       single-precision floats
     longreal, double, extended   double-precision floats
     pointer, anyptr              generic pointers (assignment-compatible with any pointer type)
     string                       generic string of length StringDefault (normally 255)

The usual Pascal types char, boolean, and text are also understood. (If your Pascal uses different names for these concepts, the Synonym option will come in handy.)

Embedded code. It is possible to write a Pascal comment containing C code to be embedded into the output. See the descriptions of EmbedComment and its relatives in the system p2crc file. These techniques are helpful if you plan to do repeated translations of code that is still being maintained in Pascal. See the description of ReplaceBefore for an example use of embedded code.

Comments and blank lines. P2c collects the comments in a procedure into a list. All comments and statements are stamped with serial numbers which are used to reattach comments to statements even after code has been added, removed, or rearranged during translation. "Orphan" comments attached to statements that have been lost are attached to nearby statements or emitted at the end of the procedure. Blank lines are treated as a kind of comment, so p2c will also reproduce your usage of blank lines. If the comment mechanism goes awry, you can disable comments with EatComments or disable their being attached to code with SpitComments.

Indentation. P2c has a number of parameters to govern indentation of code. The default values produce the GNU Emacs standard indentation style, although p2c can do a better job since it knows more about the code it is indenting. Indentation works by applying "indentation deltas," which are either absolute numbers (which override the previous indentation), or signed relative numbers (which augment the previous indentation). A delta of "+0" specifies no change in indentation. All of the indentation options are described in the standard p2crc file.

Line breaking. P2c uses an algorithm similar to the TeX typesetter's paragraph formatter for breaking long statements into multiple lines. A "penalty" is assigned to various undesirable aspects of all possible line breaks; the "badness" of a set of line breaks is approximately the sum of all the penalties. Chief among these are serious penalties for overrunning the desired maximum line length (default 78 columns), an infinite penalty for overrunning the absolute maximum line length (default 90), and progressively greater penalties for breaking at operators deeply nested in expressions. Parameters such as OpBreakPenalty control the relative weights of various choices. BreakArith and its neighbors control whether the operator at a line break should be placed at the end of the previous line or at the beginning of the next. If you don't want any oversize lines, define MaxLineWidth=78.

Unlike TeX, p2c's line breaker must actually try all possible sets of break points. To avoid excessive computation, the total penalty contributed at each decision point must sum to a nonnegative value; negative values are clipped up to zero. This allows p2c to prune away obviously undesirable alternatives in advance. The MaxLineBreakTries parameter (default 5000) controls how many alternatives to try before giving up and using the best so far.
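The search described above can be illustrated with a small C sketch. This is not p2c's actual line breaker: the token model, the weight of 100 per overrun column, and every name below are assumptions chosen for the example. It shows only the three ideas from the text: penalties are summed, negative contributions are clipped to zero so that a partial total can only grow and can therefore be used for pruning, and a try limit stops the search early.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SOFT_WIDTH 78        /* desired maximum line length */
#define HARD_WIDTH 90        /* absolute maximum line length */
#define INFINITE   INT_MAX   /* stands in for an infinite penalty */

typedef struct {
    const int *tok_len;   /* width of each token (spacing ignored) */
    const int *brk_pen;   /* penalty for breaking after each token */
    int  ntok;
    int  best;            /* lowest total penalty found so far */
    long tries;           /* complete alternatives examined */
    long max_tries;       /* give up after this many */
} Breaker;

/* Made-up weight: 100 per column past the soft limit. */
static int overrun_penalty(int width)
{
    if (width > HARD_WIDTH) return INFINITE;
    if (width > SOFT_WIDTH) return 100 * (width - SOFT_WIDTH);
    return 0;
}

/* Negative contributions are clipped up to zero. */
static int clip(int penalty)
{
    return penalty < 0 ? 0 : penalty;
}

static void search(Breaker *b, int tok, int width, int total)
{
    int w;

    if (total >= b->best || b->tries >= b->max_tries)
        return;                      /* prune, or out of budget */
    if (tok == b->ntok) {            /* one complete set of breaks */
        b->tries++;
        if (total + overrun_penalty(width) < b->best)
            b->best = total + overrun_penalty(width);
        return;
    }
    w = width + b->tok_len[tok];
    if (w > HARD_WIDTH)
        return;                      /* infinite penalty: never acceptable */
    search(b, tok + 1, w, total);    /* stay on the same line */
    if (tok + 1 < b->ntok)           /* or break after this token */
        search(b, tok + 1, 0,
               total + overrun_penalty(w) + clip(b->brk_pen[tok]));
}

/* Returns the lowest total penalty found, or INFINITE if none fits. */
int best_break_penalty(const int *tok_len, const int *brk_pen,
                       int ntok, long max_tries)
{
    Breaker b;

    assert(tok_len != NULL && brk_pen != NULL && ntok > 0);
    b.tok_len = tok_len;
    b.brk_pen = brk_pen;
    b.ntok = ntok;
    b.best = INFINITE;
    b.tries = 0;
    b.max_tries = max_tries;
    search(&b, 0, 0, 0);
    return b.best;
}
```

The pruning test at the top of search is only sound because no step can lower the running total, which is the reason the text gives for clipping negative penalties.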

PASCAL_MAIN. P2c generates a call to this function at the front of the main program. In the (unmodified) run-time library all this does is save argc and argv away because in both HP and Turbo these are accessed as global variables. If you do not wish to use this feature, define ArgCName to be argc, ArgVName to be argv, and MainName (normally "PASCAL_MAIN") to be blank. This will work if argc and argv are never accessed outside of your main program.  
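A minimal sketch of the idea behind PASCAL_MAIN, assuming hypothetical names throughout: the function and variable names below are invented for illustration and are not the ones used by the run-time library. The point is only that the arguments of main are copied into globals so that code far from main can reach them, as HP and Turbo Pascal programs expect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical globals; in real p2c output the names are governed by
   the ArgCName and ArgVName parameters. */
static int    saved_argc;
static char **saved_argv;

/* Hypothetical stand-in for PASCAL_MAIN: save the arguments away. */
void save_main_args(int argc, char **argv)
{
    assert(argc >= 0 && argv != NULL);
    saved_argc = argc;
    saved_argv = argv;
}

/* Turbo-style accessors that can be called from anywhere. */
int param_count(void)
{
    return saved_argc > 0 ? saved_argc - 1 : 0;
}

const char *param_str(int n)
{
    return (n >= 0 && n < saved_argc) ? saved_argv[n] : "";
}
```

A main program would call save_main_args(argc, argv) as its first statement. If argc and argv are only ever used inside main itself, this indirection buys nothing, which is the case the ArgCName, ArgVName, and MainName settings above are meant for.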

BUGS

P2c was designed with the idea that clean, readable output in most cases is worth more than guaranteed correct output in extreme cases. P2c is not a compiler! However, ideally the "extreme" cases would include only those which never arise in real life. Thus if p2c actually generates incorrect code I will consider it a bug, but I will not apologize for it. :-) Below are the major remaining cases where this is known to occur.

Certain kinds of conformant array parameters (including multi-dimensional conformant arrays) produce code that declares variable-length arrays in C. Only a few C compilers, such as the GNU C compiler, support this language extension. Otherwise some hand re-coding will be required.

HP Pascal try-recover structures are translated into calls to TRY and RECOVER macros, which are defined to simulate the construct using setjmp and longjmp. If this emulation does not work, define the symbol FAKE_TRY to cause these macros to become "inert." (In cases where the error is detected by code physically within the body of the try statement, a C goto to the recover section is always generated.) Also, local file variables in scopes which are destroyed by an escape are not closed.
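The setjmp/longjmp technique mentioned above can be sketched as follows. This is an illustration of the general approach only; it does not reproduce the TRY and RECOVER macros from p2c.h, and all names in it are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf *current_handler = NULL;  /* innermost active try, if any */
static int      escape_code     = 0;     /* code passed to the last escape */

/* Analogous to HP Pascal's escape: unwind to the innermost recover. */
static void escape_to_handler(int code)
{
    assert(current_handler != NULL);
    escape_code = code;
    longjmp(*current_handler, 1);
}

static int checked_divide(int a, int b)
{
    if (b == 0)
        escape_to_handler(-5);           /* arbitrary code for the example */
    return a / b;
}

int last_escape_code(void)
{
    return escape_code;
}

/* Pascal:  try  q := a div b  recover  q := fallback; */
int divide_or(int a, int b, int fallback)
{
    jmp_buf  here;
    jmp_buf *outer = current_handler;    /* remember the enclosing try */
    volatile int q = fallback;           /* volatile: modified after setjmp */

    current_handler = &here;
    if (setjmp(here) == 0) {
        q = checked_divide(a, b);        /* body of the try */
        current_handler = outer;
    } else {
        current_handler = outer;         /* recover section */
        q = fallback;
    }
    return q;
}
```

The sketch also shows why an escape cannot clean up after the scopes it skips: longjmp simply abandons the intervening stack frames, so nothing runs on their behalf, which matches the remark about local file variables not being closed.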

Non-local GOTO's and try-recover statements are each implemented, but may conflict if both are used at once. Non-local GOTO's are fairly careful about closing files that go out of scope but may fail to do so in the presence of recursion.

Arrays containing files are not initialized to NULL as other files are. In some cases, such as file variables allocated by NEW, the file is initialized but not automatically closed by DISPOSE.

LINK variables allowing sub-procedures access to their parents' variables are occasionally omitted by mistake, if the access is too indirect for p2c to notice. If this happens, you can add an explicit reference to a parent variable in the sub-procedure. A statement of the form "a:=a" will count as a reference but then be optimized away by p2c.

Many aspects of Modula-2 are translated only superficially. For example, the type-compatibility properties of the WORD and ARRAY OF WORD types are only roughly modelled, as are the scope rules concerning modules.

Parts of VAX Pascal are still untreated. In particular, the [UNSAFE] attribute and a few others are not fully supported, nor are the semantics of the OPEN procedure.

Turbo and VAX Pascal's double, quadruple, and extended real types all translate to the C double type. Turbo's computational type is not supported at all.

Because Pascal strings (with length bytes) are translated into C strings (with null terminators), certain Pascal string tricks will not work in the translated code. For example the assignment s[0]:=chr(x) is translated to s[x]=0 on the assumption that the string is being shortened. If x is actually greater than the current length, but not of a recognizable form like ord(s[0])+n, then the generated code will not work. In VAX Pascal this corresponds to performing arithmetic on the LENGTH field of a varying-length string.
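The following C sketch shows why the s[x]=0 translation works for shortening but not for lengthening. The helper name is hypothetical; the body is just the one-line assignment quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Pascal:  s[0] := chr(n);   translated on the assumption that the
   string is being shortened:  s[n] = 0; */
void set_length_by_terminator(char *s, size_t n)
{
    assert(s != NULL);
    s[n] = '\0';
}
```

With a buffer holding "hello", calling the helper with n = 3 leaves "hel", as the Pascal code intended. With a buffer holding "hi" and n = 4, the new terminator lands beyond the existing one at index 2, so the C string still has length 2: the Pascal program believes the string grew, and the C program does not.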

Turbo Pascal's automatic clipping of strings is not supported. In Turbo, if a ten character string is assigned to a string[8] variable, the last two characters are silently removed. The code produced by p2c generally will overrun the target string instead! The StringTruncLimit parameter (80 by default if Language=Turbo) specifies a string size which should be considered "short"; assignments of potentially-long strings to short string variables will cause a warning but will not automatically truncate. The cure is to use copy in the Pascal source to truncate the strings explicitly.  
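For comparison, here is a small C sketch of the clipping behavior that Turbo performs and that translated code lacks. The helper assign_clipped is a hypothetical hand-written function, not something p2c generates or its library provides; it is shown only to make the difference from an unbounded copy concrete.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy at most maxlen characters of src into dst and always terminate.
   dst must have room for maxlen + 1 bytes and must not overlap src. */
char *assign_clipped(char *dst, const char *src, size_t maxlen)
{
    size_t n;

    assert(dst != NULL && src != NULL);
    n = strlen(src);
    if (n > maxlen)
        n = maxlen;                  /* silently drop the excess, as Turbo does */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return dst;
}
```

Assigning a ten-character string to a buffer declared for string[8] through this helper keeps the first eight characters, whereas a plain strcpy would write past the end of the buffer. Truncating with copy in the Pascal source, as recommended above, achieves the same effect without hand-editing the generated C.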

FILES

file.xxx                  Pascal source files

file.c                    resulting C source file

module.h                  resulting C header file

p2crc                     local configuration file

.p2crc                    alternate local configuration file

--HOMEDIR--/p2crc         system-wide configuration file

--HOMEDIR--/system.imp    declarations for predefined functions

--HOMEDIR--/system.m2     analogous declarations for Modula-2

--HOMEDIR--/*.imp         interface text for standard modules

--INCDIR--/p2c.h          header file for translated programs

--LIBDIR--/libp2c.a       run-time library
 

AUTHOR

Dave Gillespie, daveg@synaptics.com.

Many thanks to William Bader, Steven Levi, Rick Koshi, Eric Raymond, Magne Haveraaen, Dirk Grunwald, David Barto, Paul Fisher, Tom Schneider, Dick Heijne, Guenther Sawitzki, and many others whose suggestions and bug reports have helped improve p2c in countless ways.


 




