PyjamaScheme
From IPRE Wiki
Pyjama Scheme is a new implementation of a Scheme-based language for .NET/Mono. It implements many core Scheme functions, but also adds some functionality to bring it into line with the other modern languages like Python and Ruby. The goal is to have a complete environment for teaching, called Pyjama.
Contents |
Background
Pyjama Scheme is actually written in Scheme. It is written in a special form (continuation-passing style) and then automatically converted through a series of transformations, into C# so that it can be run on .NET/Mono. The generated language could be any language, but for this project we chose Scheme (because we need a Scheme for Pyjama).
Pyjama Scheme is currently under development. Plans:
- get a basic scheme working in .NET/Mono [done]
- use the Dynamic Language Runtime (DLR) for environments (more on that below)
- integrate it with the rest of the DLR
Pyjama Scheme is not a regular scheme and is not meant to be compatible with others. It is designed for education and research.
Current limitations:
- The command-line in C# allows only single line entries
Extensions
- Modules - implements namespaces for imports and foreign objects
- Exceptions - implements try/catch/finally
Modules
Modules provide an easy method for structuring hierarchies of libraries and code.
Namespaces
Import
(import <string-exp>) (import <string-exp> '<symbol-exp>)
Examples
==> (import "my-file.ss")
my-file.ss is a Scheme program file, which itself could have imports.
==> (import "my-file.ss" 'my-stuff)
Loads the file, and puts it in the namespace "my-stuff" accessible through the lookup interface below.
Lookup
module.name module.module.name
==> (import "my-file.ss" 'my-stuff) ==> my-stuff.x 5
Directory Listing
==> (dir) (- * / + < = > and append apply caaaar caaadr caaar caadar caaddr caadr caar cadaar cadadr cadar caddar cadddr caddr cadr call/cc call-with-current-continuation car case cdaaar cdaadr cdaar cdadar cdaddr cdadr cdar cddaar cddadr cddar cdddar cddddr cdddr cddr cdr cond cons current-time debug dir display env eq? equal? eval exit float for-each format get group help import int iter? length let let* letrec list list? list->vector list-head list-tail load map member memq newline not null? or pair? parse print printf procedure? property range record-case reverse set-car! set-cdr! sort sqrt string? string<? string->list string->symbol symbol symbol->string typeof using vector vector? vector->list vector-ref vector-set! vector-set!) ==> (dir my-stuff) (x y z)
Iterators
Strings, vectors, and lists all work with map and for-each.
==> (map display "123") 123(void void void) ==> (for-each (lambda (v) (printf "~a\n")) (vector 1 2 3)) 1 2 3
Help/Doc System
When you define a variable, you can optionally add a doc-string:
==> (define x "This variable holds the sum" 0) ==> (define y 0) ==> (define function "This computes the polynomial..." (lambda (x) ...))
You can lookup the doc-string with:
==> (help 'x) This variable holds the sum
Command-line
You can run scheme code on the command line:
MONO_PATH=DLLs mono pjscheme.exe examples.ss file2.ss
Misc
get will lookup a symbol in the environment:
==> (get 'get) #<procedure>
typeof will give you the .NET type of a value:
==> (typeof 1) System.Int32
==> (typeof 238762372632732736) Microsoft.Scripting.Math.BigInteger
==> (typeof 1/5) Rational
Exceptions
(raise <exp>)
(try <exp>...) (try <exp>... (catch <sym> <exp>...)) (try <exp>... (finally <exp>...)) (try <exp>... (catch <sym> <exp>...) (finally <exp>...))
There are four main forms of try:
- try alone: has no effect---doesn't catch exceptions
- try with a catch-clause: if exception in try-body, catch-clause will catch and provide return value, otherwise body of try provides. If exception in catch-clause will simple raise it.
- try with a finally-clause: finally-clause will run if exception in try-body or not. If there is an exception in the finally, it will raise it.
- try with a catch-clause and finally-clause: if an exception in try-body, the catch-clause will run, followed by the finally-clause. If an exception is raised in the catch-clause too, then finally-clause will run, and then raise the catch-exception; if there is an exception raised in the
finally-clause, then it will raise.
The <sym> of the catch-clause will be bound to the expression raised.
Examples
==> > (try x) (uncaught exception: "unbound variable x") ==> "(try x (catch e e))" "unbound variable x" ==> "(try x (catch e (raise e)))" (uncaught exception: "unbound variable x") ==> "(try x (finally 'hi))" hi (uncaught exception: "unbound variable x") ==> "(try x (catch e 1 2 3 4 (finally 'hi))" hi 4
Bugs
- Defining a vector returns an uncaught exception and if we call the defined functions name, it returns the first element of vector. FIXED
- Bug in (map - '(1 2 3)). Error was in Subtract with one arg. FIXED
- (/ 5 1) should give an integer, but gives rational with 1 in denominator. FIXED
- (/ 5) should give 1/5 but gives an error. FIXED
- Append doesn't work for empty lists, and no error checking. FIXED
- Not sure whether is a bug: if (display '(don't try this at home)) returns (uncaught exception: "unexpected character ' encountered") (petite: (don (quote t) try this at home)) Error in reader-cps.ss FIXED
- Environment.cs -- is this file necessary? Why is it written in Scheme but labeled .cs? FIXED
- Bug in mapping printf in C#; example: (map printf '("hello")). FIXED
- No Error for incorrect number of arguments.
- PJScheme parser doesn't allow #3(1 2 3) only #(1 2 3)
- C# PJScheme doesn't work with #(1 2 3)
Features
- Semantics of dotted-names are complex:
- First we look up the name ("math.x.y.z") if value found, return it
- Next, we lookup "math", then "math.x", "math.x.y", and "math.x.y.z"
- If none of those are modules, then it is an error
- This makes it impossible to mix dotted names with modules. OK
- Changed the dir command to include macros (special forms from define-syntax). So, it now lists "and", "or", "let" etc. Idea by Julia Kelly.
Getting Started
This example is demonstrating the Pyjama Scheme source code under Linux (specifically, Fedora9). Fedora9 uses mono version 1.9.1. See PyjamaDevelopment for more detailed information.
On Fedora Linux, you'll need the following packages:
mono-core mono-basic indent make svn
You will also need a base Scheme. We use Petite Scheme, the free version of Chez Scheme.
su - wget http://www.scheme.com/download/PetiteChezScheme-7.4-1.i386.rpm rpm -ihv PetiteChezScheme-7.4-1.i386.rpm
Finally, use svn to checkout the Scheme directory:
svn checkout http://svn.cs.brynmawr.edu/Myro/trunk/Pyjama/modules/Scheme cd Scheme make pjscheme
Will give you a Scheme prompt ==> .
==> (dir) (- * / + < = > and append apply caaaar caaadr caaar caadar caaddr caadr caar cadaar cadadr cadar caddar cadddr caddr cadr call/cc call-with-current-continuation car case cdaaar cdaadr cdaar cdadar cdaddr cdadr cdar cddaar cddadr cddar cdddar cddddr cdddr cddr cdr cond cons current-time debug dir display env eq? equal? eval exit float for-each format get group import int length let let* letrec list list? list->vector list-head list-tail load map member memq newline not null? or parse print printf procedure? property range record-case reverse set-car! set-cdr! sort sqrt string<? string->list string->symbol symbol symbol->string typeof using vector) ==> (exit)
To make changes, you need only edit the Scheme programs ending in -cps.ss. Running make pjscheme will rebuild the system.
Under the Hood
Pyjama Scheme is written in Scheme in these files:
- environments-cps.ss - environment
- interpreter-cps.ss - interpreter
- parser-cps.ss - parser
- reader-cps.ss - scanner/lexer
- unifier-cps.ss - pattern matching (for define-datatype)
These are written in Continuation-Passing Style (CPS).
These files are transformed into C# through a series of stages:
- grouped into a single file, pjscheme-cps.ss
- converted from continuation-passing style TO data structures, pjscheme-ds.ss
- converted from data structures TO register machine, pjscheme-rm.ss
- register machine TO C# code, pjscheme.cs
So pjscheme-cps.ss is converted to pjscheme-ds.ss, then pjscheme-ds.ss is converted to pjscheme-rm.ss, and finally pjscheme-rm.ss is converted to pjscheme.cs.
Support for low-level Scheme functions are defined in Scheme.cs.
Testing Scheme in Scheme
As we are writing Scheme in Scheme, you might want to first test your changes by running the Scheme directly in Petite. To do that:
$ petite interpreter-cps.ss > (start) ===>
That is, run petite interpreter-cps.ss at the shell prompt, and then (start) at the petite prompt. You'll get the PJ prompt, with PJ running in Petite.
pjscheme.cs
Scheme commands, such as string->list, are primitives, so there isn't any Scheme code that implements it... it is at a lower level. Petite Scheme is probably written in C, so string->list is written in C. So, that means we have to write functions like this manually in C#.
You'll find all of the Pyjama Scheme primitives written so far in Scheme.cs. But, C# can't have hyphens and greater-thans in symbol names. So, there are some common character combinations that are automatically replaced when going from Scheme to C#:
- -> becomes _to_
- - becomes _
- ? becomes _q
- ! becomes _b
So, string->list is a primitive defined in Scheme.cs as string_to_list.
In the Scheme code, k (or k2) represents a continuation, and handler is an error handler (for taking care of exceptions and errors).
There is a very basic trace mechanism in pjscheme.cs. You can set the level of detail with:
==> (debug N)
where N is a number between 0 and 10, inclusive. You can also see the debug level with:
==> (debug) 0
Zero means off (no tracing) and 10 is maximum tracing. The indent level shows the level of trace, which is approximately the low-levelness of the code. We should come up with a better means than this!
Advanced Issues
If you are writing code in CPS fashion, then you should use "define*" to define the function. Otherwise, it is just a regular function, and just use "define".
If you are using a lambda for something other than a function definition, use:
- lambda-cont: indicates it is a continuation that receives a single value
- lambda-cont2: indicates it is a continuation that receives 2 values
- lambda-macro: indicates that this is a macro
- lambda-handler: indicates that this is an error handler
If you are writing code that maps a new function (one that you have defined in Scheme) you need to add create what is called a "delegate". This is a bit of a hassle for now. If you had a new function named, say, fixstring, and you were mapping that to a list, you need to add:
public static Proc fixstring_proc =
new Proc(\"fixstring\", (Procedure1) fixstring, 1, 1);
to the string variable named *code* in the file scheme-to-csharp.ss. The first name should be the name of the function with "_proc" appended. Thus: "fixstring_proc".
NOTE: This is only required for support code (e.g., code that implements the language). It is not necessary for functions in Pyjama Scheme.
Example Extensions
The following examples show you what is necessary to make changes to Pyjama Scheme. After any changes, you should be able to:
make pjscheme
and it should rebuild. If something doesn't seem to rebuild correctly, you can issue a make clean followed by make pjscheme.
To make sure you have the latest, up to date code, do this in the Scheme directory:
svn update make clean
You may want to do that each session before beginning work.
To make a "patch" to submit, do this:
svn diff > descriptive-name.patch
The "diff" will put all of the differences between the current version and yours (ie, your changes). Someone can then apply your changes to the server's code.
To apply a patch to your own code:
patch -p0 < descriptive-name.patch
That will make all of the changes in the patch file to your copy of files in the directory.
Example Extension: Vector
Example extension #1: Adding a new function to Pyjama Scheme
The first example is adding a new function to the Pyjama Scheme environment in interpreter-cps.ss.
- Add the keyword to the keywords
- Add the function code to list of functions
For this example, let's add a new function vector that takes a number of arguments and puts them into a vector (eg, an array). It will work like:
==> (vector 'a 'b 'c) (vector a b c)
This is not a special form (doesn't have special syntax) but is just a new function that does something to a list of evaluated arguments. The first step is to add the keyword vector to the list of words in the environment. This is done in the interpreter-cps.ss file, near line 255:
(make-initial-environment
(list 'exit 'eval 'parse 'apply 'sqrt 'print 'display 'newline 'load 'null? 'cons 'car 'cdr
'list '+ '- '* '/ '< '> '= 'equal? 'eq? 'memq 'range 'set-car! 'set-cdr!
'import 'get 'call-with-current-continuation 'call/cc
'reverse 'append 'list->vector 'dir 'current-time 'map 'for-each 'env
'using 'not 'printf 'vector)
Next, we add the function to the list of functions. Again, in interpreter-cps.ss near line 331:
...
;; printf
(lambda-proc (args env2 handler k2) (begin (apply printf-prim args) (k2 '<void>)))
;; vector
(lambda-proc (args env2 handler k2) (k2 (apply make-vector args)))
...
NOTE: the order of words added in step 1 must match the order of functions here!
In this example, we need merely call make-vector on the arguments and pass it to the continuation (k2).
If you are defining functions that are used by the system, then you need one last step. You will need to add those to scheme-to-csharp.cs in the *code* list, like:
public static Proc make_vector_proc = new Proc(\"list->vector\", (Procedure1) make_vector, 1, 1);
If you didn't define these explicitly (perhaps they are low-level scheme functions) then you need to implement them. You can write them in C# in Scheme.cs, or write them in Scheme in interpreter-cps.ss and let them get translated. If they are low-level Scheme functions, such as car, you'll need to write them in C#.
Above, the Proc() constructor takes three parameters: the name of the function (this isn't used, except to give feedback to the user in times of error or tracing), the name of the C# function to call, the type of arguments it takes, and the type of return value.
Arguments it takes:
- -1, means variable length list (one arg, interpreted as a list)
- 1, means exactly one (interpreted as an object)
- 2, means exactly two (interpreted as objects)
- 3, means exactly three (interpreted as objects)
What it returns:
- 0, means return void (nothing returned)
- 1, means return object
- 2, means return boolean
Finally, the cast must match the incoming parameter count and type:
- Procedure0, takes no arguments, returns object
- Procedure0Bool, takes no arguments, returns boolean
- Procedure1, if it takes one argument (variable or not), returns Object
- Procedure1Bool, takes one argument, returns boolean
- etc. up to 3
Trouble Shooting
I have added a function in the -cps Scheme code, but when I try to "make pjscheme" I get the error that "not all code paths return a value". What am I doing wrong?
This can happen when you add a function such as:
(define help
(lambda (args env handler k)
(lookup-value args env handler k)))
to interpreter-cps.ss. This produces the following C# code:
new public static object help (object args, object env, object handler, object k2) {
k_reg = k;
handler_reg = handler;
env_reg = env;
variable_reg = args;
pc = (Function) lookup_value;
}
If you look closely at the C# function, it doesn't return anything. In fact, the code is just sets some values, as that is what the transformations do: they unroll the recursions and function calls so there is no stack, and no return values. All that help does is call lookup_value. The pc (program counter) will call that function, which gets its arguments from the registers set in here in help.
However, the "public static object help" says that you will return an "object". This is a guess that the transformation makes, but it is wrong. To fix this, you need to manually tell the Scheme-to-CSharp system that the function help does not return anything. You can do that by adding an entry in the list *system-function-signatures* in the file scheme-to-csharp.ss. Like so:
... (help "void" ()) ...
This says: "use void as the return type for the function help." Void means that this doesn't return anything. This signature doesn't specify the types of the four arguments to pass in to help (which would be given in the list after "void") because they are all "object", which is the default.
Future Work
Ideas for projects:
- Make Pyjama Scheme behave more like Python:
- add iterators and support
- Try Pyjama Scheme command (range 10)
- add ways to make other items iterators
- Try Pyjama Scheme command (for-each function list) ie, (for-each display '(1 2 3)).
- add iterators and support
- Hook into the Dynamic Language Runtime (DLR)
- use the DLR environments
- use it to load DLLs
- see the support for modules in Pyjama Scheme (above)
- Add better debugging support
- scheme code should know file, line, col of where they are defined
- scheme functions should know their name (if they have one)
- Add a help system
- Add a step function
- Add some Scheme primitives
- what's missing?
References
Pyjama
- PyjamaDevelopment - plans and details for Pyjama development
Dynamic Language Runtime - DLR
- http://compilerlab.members.winisp.net/dlr-spec-hosting.pdf - DLR Hosting Specification
- http://blogs.msdn.com/mmaly/archive/2008/01/14/building-a-dlr-language-trees.aspx - Overview of ToyScript
- http://blogs.msdn.com/mmaly/archive/tags/Dynamic+Language+Runtime/default.aspx - Aspects of the DLR (read from bottom to top)
Abstract Syntax Tree - AST
- http://www.bitwisemag.com/2/DLR-Build-Your-Own-Language
- http://devhawk.net/2008/01/29/Practical+F+Parsing+Recursion+And+Predicate+Functions.aspx
- http://www.knowing.net/PermaLink,guid,526e4ccd-39e8-4932-b9a4-d15e6c82bfbf.aspx
- http://blogs.gotdotnet.com/mmaly/archive/2008/01/14/building-a-dlr-language-trees.aspx
