schrodinger.application.desmond.antlr3.treewizard module

@package antlr3.tree @brief ANTLR3 runtime package, treewizard module

A utility module to create ASTs at runtime. See <http://www.antlr.org/wiki/display/~admin/2007/07/02/Exploring+Concept+of+TreeWizard> for an overview. Note that the API of the Python implementation is slightly different.

schrodinger.application.desmond.antlr3.treewizard.computeTokenTypes(tokenNames)

Compute a dict that is an inverted index of tokenNames (which maps int token types to names).

class schrodinger.application.desmond.antlr3.treewizard.TreePatternLexer(pattern)

Bases: object

__init__(pattern)
nextToken()
consume()
class schrodinger.application.desmond.antlr3.treewizard.TreePatternParser(tokenizer, wizard, adaptor)

Bases: object

__init__(tokenizer, wizard, adaptor)
pattern()
parseTree()
parseNode()
class schrodinger.application.desmond.antlr3.treewizard.TreePattern(payload)

Bases: schrodinger.application.desmond.antlr3.tree.CommonTree

When using %label:TOKENNAME in a tree for parse(), we must track the label.

__init__(payload)

Create a new node from an existing node does nothing for BaseTree as there are no fields other than the children list, which cannot be copied as the children are not considered part of this node.

toString()

Override to say how a node (not a tree) should look as text

addChild(childTree)

Add t as child of this node.

Warning: if t has no children, but child does and child isNil then this routine moves children to t via t.children = child.children; i.e., without copying the array.

addChildren(children)

Add all elements of kids list as children of this node

property charPositionInLine
deleteChild(i)
dupNode()
freshenParentAndChildIndexes(offset=0)

Set the parent and child index values for all children

getAncestor(ttype)

Walk upwards and get first ancestor with this token type.

getAncestors()

Return a list of all ancestors of this node.

The first node of list is the root and the last is the parent of this node.

getCharPositionInLine()
getChild(i)
getChildCount()
getChildIndex()

BaseTree doesn’t track child indexes.

getChildren()

@brief Get the children internal List

Note that if you directly mess with the list, do so at your own risk.

getFirstChildWithType(treeType)
getLine()

In case we don’t have a token payload, what is the line for errors?

getParent()

BaseTree doesn’t track parent pointers.

getText()
getToken()
getTokenStartIndex()
What is the smallest token index (indexing from 0) for this node

and its children?

getTokenStopIndex()

What is the largest token index (indexing from 0) for this node and its children?

getType()

Return a token type; needed for tree parsing.

hasAncestor(ttype)

Walk upwards looking for ancestor with this token type.

isNil()

Indicates the node is a nil node but may still have children, meaning the tree is a flat list.

property line
replaceChildren(startChildIndex, stopChildIndex, newTree)

Delete children from start to stop and replace with t even if t is a list (nil-root tree). num of children can increase or decrease. For huge child lists, inserting children can force walking rest of children to set their childindex; could be slow.

sanityCheckParentAndChildIndexes(parent=None, i=- 1)
setChild(i, t)

Set ith child (0..n-1) to t; t must be non-null and non-nil node

setChildIndex(idx)

BaseTree doesn’t track child indexes.

setParent(t)

BaseTree doesn’t track parent pointers.

setTokenStartIndex(index)
setTokenStopIndex(index)
setUnknownTokenBoundaries()

For every node in this subtree, make sure it’s start/stop token’s are set. Walk depth first, visit bottom up. Only updates nodes with at least one token index < 0.

property text
toStringTree()

Print out a whole tree not just a node

property tokenStartIndex
property tokenStopIndex
property type
class schrodinger.application.desmond.antlr3.treewizard.WildcardTreePattern(payload)

Bases: schrodinger.application.desmond.antlr3.treewizard.TreePattern

__init__(payload)

Create a new node from an existing node does nothing for BaseTree as there are no fields other than the children list, which cannot be copied as the children are not considered part of this node.

addChild(childTree)

Add t as child of this node.

Warning: if t has no children, but child does and child isNil then this routine moves children to t via t.children = child.children; i.e., without copying the array.

addChildren(children)

Add all elements of kids list as children of this node

property charPositionInLine
deleteChild(i)
dupNode()
freshenParentAndChildIndexes(offset=0)

Set the parent and child index values for all children

getAncestor(ttype)

Walk upwards and get first ancestor with this token type.

getAncestors()

Return a list of all ancestors of this node.

The first node of list is the root and the last is the parent of this node.

getCharPositionInLine()
getChild(i)
getChildCount()
getChildIndex()

BaseTree doesn’t track child indexes.

getChildren()

@brief Get the children internal List

Note that if you directly mess with the list, do so at your own risk.

getFirstChildWithType(treeType)
getLine()

In case we don’t have a token payload, what is the line for errors?

getParent()

BaseTree doesn’t track parent pointers.

getText()
getToken()
getTokenStartIndex()
What is the smallest token index (indexing from 0) for this node

and its children?

getTokenStopIndex()

What is the largest token index (indexing from 0) for this node and its children?

getType()

Return a token type; needed for tree parsing.

hasAncestor(ttype)

Walk upwards looking for ancestor with this token type.

isNil()

Indicates the node is a nil node but may still have children, meaning the tree is a flat list.

property line
replaceChildren(startChildIndex, stopChildIndex, newTree)

Delete children from start to stop and replace with t even if t is a list (nil-root tree). num of children can increase or decrease. For huge child lists, inserting children can force walking rest of children to set their childindex; could be slow.

sanityCheckParentAndChildIndexes(parent=None, i=- 1)
setChild(i, t)

Set ith child (0..n-1) to t; t must be non-null and non-nil node

setChildIndex(idx)

BaseTree doesn’t track child indexes.

setParent(t)

BaseTree doesn’t track parent pointers.

setTokenStartIndex(index)
setTokenStopIndex(index)
setUnknownTokenBoundaries()

For every node in this subtree, make sure it’s start/stop token’s are set. Walk depth first, visit bottom up. Only updates nodes with at least one token index < 0.

property text
toString()

Override to say how a node (not a tree) should look as text

toStringTree()

Print out a whole tree not just a node

property tokenStartIndex
property tokenStopIndex
property type
class schrodinger.application.desmond.antlr3.treewizard.TreePatternTreeAdaptor

Bases: schrodinger.application.desmond.antlr3.tree.CommonTreeAdaptor

This adaptor creates TreePattern objects for use during scan()

createWithPayload(payload)

Create a tree node from Token object; for CommonTree type trees, then the token just becomes the payload. This is the most common create call.

Override if you want another kind of node to be built.

addChild(tree, child)

Add a child to the tree t. If child is a flat tree (a list), make all in list children of t. Warning: if t has no children, but child does and child isNil then you can decide it is ok to move children to t via t.children = child.children; i.e., without copying the array. Just make sure that this is consistent with have the user will build ASTs.

becomeRoot(newRoot, oldRoot)

If oldRoot is a nil root, just copy or move the children to newRoot. If not a nil root, make oldRoot a child of newRoot.

old=^(nil a b c), new=r yields ^(r a b c) old=^(a b c), new=r yields ^(r ^(a b c))

If newRoot is a nil-rooted single child tree, use the single child as the new root node.

old=^(nil a b c), new=^(nil r) yields ^(r a b c) old=^(a b c), new=^(nil r) yields ^(r ^(a b c))

If oldRoot was null, it’s ok, just return newRoot (even if isNil).

old=null, new=r yields r old=null, new=^(nil r) yields ^(nil r)

Return newRoot. Throw an exception if newRoot is not a simple node or nil root with a single child node–it must be a root node. If newRoot is ^(nil x) return x as newRoot.

Be advised that it’s ok for newRoot to point at oldRoot’s children; i.e., you don’t have to copy the list. We are constructing these nodes so we should have this control for efficiency.

create(*args)

Deprecated, use createWithPayload, createFromToken or createFromType.

This method only exists to mimic the Java interface of TreeAdaptor.

createFromToken(tokenType, fromToken, text=None)

Create a new node derived from a token, with a new token type and (optionally) new text.

This is invoked from an imaginary node ref on right side of a rewrite rule as IMAG[$tokenLabel] or IMAG[$tokenLabel “IMAG”].

This should invoke createToken(Token).

createFromType(tokenType, text)

Create a new node derived from a token, with a new token type.

This is invoked from an imaginary node ref on right side of a rewrite rule as IMAG[“IMAG”].

This should invoke createToken(int,String).

createToken(fromToken=None, tokenType=None, text=None)

Tell me how to create a token for use with imaginary token nodes. For example, there is probably no input symbol associated with imaginary token DECL, but you need to create it as a payload or whatever for the DECL node as in ^(DECL type ID).

If you care what the token payload objects’ type is, you should override this method and any other createToken variant.

deleteChild(t, i)

Remove ith child and shift children down from right.

dupNode(treeNode)

Duplicate a node. This is part of the factory; override if you want another kind of node to be built.

I could use reflection to prevent having to override this but reflection is slow.

dupTree(t, parent=None)

This is generic in the sense that it will work with any kind of tree (not just Tree interface). It invokes the adaptor routines not the tree node routines to do the construction.

errorNode(input, start, stop, exc)

create tree node that holds the start and stop tokens associated with an error.

If you specify your own kind of tree nodes, you will likely have to override this method. CommonTree returns Token.INVALID_TOKEN_TYPE if no token payload but you might have to set token type for diff node type.

You don’t have to subclass CommonErrorNode; you will likely need to subclass your own tree node class to avoid class cast exception.

getChild(t, i)

Get a child 0..n-1 node

getChildCount(t)

How many children? If 0, then this is a leaf node

getChildIndex(t)

What index is this node in the child list? Range: 0..n-1 If your node type doesn’t handle this, it’s ok but the tree rewrites in tree parsers need this functionality.

getParent(t)

Who is the parent node of this node; if null, implies node is root. If your node type doesn’t handle this, it’s ok but the tree rewrites in tree parsers need this functionality.

getText(t)
getToken(t)

What is the Token associated with this node? If you are not using CommonTree, then you must override this in your own adaptor.

getTokenStartIndex(t)

Get the token start index for this subtree; return -1 if no such index

getTokenStopIndex(t)

Get the token stop index for this subtree; return -1 if no such index

getType(t)

For tree parsing, I need to know the token type of a node

getUniqueID(node)

For identifying trees.

How to identify nodes so we can say “add node to a prior node”? Even becomeRoot is an issue. Use System.identityHashCode(node) usually.

isNil(tree)

Is tree considered a nil node used to make lists of child nodes?

nil()

Return a nil node (an empty but non-null node) that can hold a list of element as the children. If you want a flat tree (a list) use “t=adaptor.nil(); t.addChild(x); t.addChild(y);”

replaceChildren(parent, startChildIndex, stopChildIndex, t)

Replace from start to stop child index of parent with t, which might be a list. Number of children may be different after this call.

If parent is null, don’t do anything; must be at root of overall tree. Can’t replace whatever points to the parent externally. Do nothing.

rulePostProcessing(root)

Transform ^(nil x) to x and nil to null

setChild(t, i, child)

Set ith child (0..n-1) to t; t must be non-null and non-nil node

setChildIndex(t, index)

What index is this node in the child list? Range: 0..n-1 If your node type doesn’t handle this, it’s ok but the tree rewrites in tree parsers need this functionality.

setParent(t, parent)

Who is the parent node of this node; if null, implies node is root. If your node type doesn’t handle this, it’s ok but the tree rewrites in tree parsers need this functionality.

setText(t, text)

Node constructors can set the text of a node

setTokenBoundaries(t, startToken, stopToken)

Track start/stop token for subtree root created for a rule. Only works with Tree nodes. For rules that match nothing, seems like this will yield start=i and stop=i-1 in a nil node. Might be useful info so I’ll not force to be i..i.

setType(t, type)

Node constructors can set the type of a node

class schrodinger.application.desmond.antlr3.treewizard.TreeWizard(adaptor=None, tokenNames=None, typeMap=None)

Bases: object

Build and navigate trees with this object. Must know about the names of tokens so you have to pass in a map or array of token names (from which this class can build the map). I.e., Token DECL means nothing unless the class can translate it to a token type.

In order to create nodes and navigate, this class needs a TreeAdaptor.

This class can build a token type -> node index for repeated use or for iterating over the various nodes with a particular type.

This class works in conjunction with the TreeAdaptor rather than moving all this functionality into the adaptor. An adaptor helps build and navigate trees using methods. This class helps you do it with string patterns like “(A B C)”. You can create a tree from that pattern or match subtrees against it.

__init__(adaptor=None, tokenNames=None, typeMap=None)
getTokenType(tokenName)

Using the map of token names to token types, return the type.

create(pattern)

Create a tree or node from the indicated tree pattern that closely follows ANTLR tree grammar tree element syntax:

(root child1 … child2).

You can also just pass in a node: ID

Any node can have a text argument: ID[foo] (notice there are no quotes around foo–it’s clear it’s a string).

nil is a special name meaning “give me a nil node”. Useful for making lists: (nil A B C) is a list of A B C.

index(tree)

Walk the entire tree and make a node name to nodes mapping.

For now, use recursion but later nonrecursive version may be more efficient. Returns a dict int -> list where the list is of your AST node type. The int is the token type of the node.

find(tree, what)

Return a list of matching token.

what may either be an integer specifzing the token type to find or a string with a pattern that must be matched.

visit(tree, what, visitor)

Visit every node in tree matching what, invoking the visitor.

If what is a string, it is parsed as a pattern and only matching subtrees will be visited. The implementation uses the root node of the pattern in combination with visit(t, ttype, visitor) so nil-rooted patterns are not allowed. Patterns with wildcard roots are also not allowed.

If what is an integer, it is used as a token type and visit will match all nodes of that type (this is faster than the pattern match). The labels arg of the visitor action method is never set (it’s None) since using a token type rather than a pattern doesn’t let us set a label.

parse(t, pattern, labels=None)

Given a pattern like (ASSIGN %lhs:ID %rhs:.) with optional labels on the various nodes and ‘.’ (dot) as the node/subtree wildcard, return true if the pattern matches and fill the labels Map with the labels pointing at the appropriate nodes. Return false if the pattern is malformed or the tree does not match.

If a node specifies a text arg in pattern, then that must match for that node in t.

equals(t1, t2, adaptor=None)

Compare t1 and t2; return true if token types/text, structure match exactly. The trees are examined in their entirety so that (A B) does not match (A B C) nor (A (B C)).