CG-3 can work with dependency trees in various ways. The input cohorts can have existing dependencies; the grammar can create new attachments; or a combination of the two.
[wordform] SETPARENT <target> [contextual_tests] TO|FROM <contextual_target> [contextual_tests] ;
Attaches the matching reading to the contextually targeted cohort as a child. The last link of the contextual test is used as target.
If the contextual target is a scanning test and the first candidate found cannot be attached due to loop prevention, SETPARENT will look onwards for the next candidate. This can be controlled with the rule options NEAREST and ALLOWLOOP.
SETPARENT targetset (-1* ("someword")) TO (1* (step) LINK 1* (candidate)) (2 SomeSet) ;
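The loop prevention described above can be pictured with a small Python model. This is only an illustrative sketch, not CG-3's implementation: the tree is a plain dict from node number to parent number, and the names `would_loop` and `set_parent` are hypothetical. The `nearest` and `allow_loop` flags model one reading of NEAREST (stop at the first candidate found) and ALLOWLOOP (attach even if a loop results); that reading is an assumption.

```python
def would_loop(parent_of, child, new_parent):
    """Return True if attaching child under new_parent would create a cycle."""
    node = new_parent
    seen = set()
    # Walk upwards from the proposed parent; meeting the child means a loop.
    while node != 0 and node not in seen:
        if node == child:
            return True
        seen.add(node)
        node = parent_of.get(node, 0)
    return False


def set_parent(parent_of, child, candidates, nearest=False, allow_loop=False):
    """Attach child to the first acceptable candidate, in scan order.

    Returns the chosen parent, or None if no attachment was made.
    """
    for cand in candidates:
        if allow_loop or not would_loop(parent_of, child, cand):
            parent_of[child] = cand
            return cand
        if nearest:
            # Assumed NEAREST behaviour: give up at the first found candidate.
            return None
    return None
```

With the tree `{1: 0, 2: 1, 3: 2, 4: 0}`, attaching node 1 with candidates `[3, 4]` skips 3 (it is already a descendant of 1) and settles on 4.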
[wordform] SETCHILD <target> [contextual_tests] TO|FROM <contextual_target> [contextual_tests] ;
Attaches the matching reading to the contextually targeted cohort as the parent. The last link of the contextual test is used as target.
If the contextual target is a scanning test and the first candidate found cannot be attached due to loop prevention, SETCHILD will look onwards for the next candidate. This can be controlled with the rule options NEAREST and ALLOWLOOP.
SETCHILD targetset (-1* ("someword")) TO (1* (step) LINK 1* (candidate)) (2 SomeSet) ;
Dependency attachments in the input come in the form of #X->Y or #X→Y tags, where X is the number of the current node and Y is the number of the parent node. Each X must be a unique positive integer, and the X values should be sequentially enumerated. '0' is reserved and means the root of the tree, so no node may claim to be '0', but nodes may attach to '0'.
If the Y of a reading cannot be located, it will be reattached to itself. If a reading contains more than one attachment, only the last will be honored. If a cohort has conflicting attachments in its readings, the result is undefined.
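The attachment rules above can be sketched in Python. This is a simplified model under stated assumptions, not CG-3's parser: it takes one reading string per cohort, and the names `DEP_TAG` and `read_attachments` are hypothetical. It keeps only the last tag in a reading and reattaches a node to itself when its parent cannot be located.

```python
import re

# Matches both the ASCII (#X->Y) and the arrow (#X→Y) form of the tag.
DEP_TAG = re.compile(r"#(\d+)(?:->|→)(\d+)")


def read_attachments(readings):
    """Map node number -> parent number from reading strings (one per cohort)."""
    parent_of = {}
    for reading in readings:
        tags = DEP_TAG.findall(reading)
        if not tags:
            continue  # e.g. punctuation without dependency numbers
        node, parent = map(int, tags[-1])  # only the last attachment is honored
        parent_of[node] = parent
    # A parent that cannot be located causes reattachment to the node itself.
    for node, parent in list(parent_of.items()):
        if parent != 0 and parent not in parent_of:
            parent_of[node] = node
    return parent_of
```

Calling `read_attachments(['"a" ART #4->5', '"man" N #5->0'])` yields `{4: 5, 5: 0}`, while a lone `'"x" N #7->9'` yields `{7: 7}` because node 9 does not exist.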
For example, in an input stream:

"<There>"
	"there" <*> ADV @F-SUBJ #1->0
"<once>"
	"once" ADV @ADVL #2->0
"<was>"
	"be" <SVC/N> <SVC/A> V PAST SG1/3 VFIN IMP @FMV #3->2
"<a>"
	"a" <Indef> ART DET CENTRAL SG @>N #4->5
"<man>"
	"man" N NOM SG @SC #5->0
"<$.>"
The command-line flag -D or --dep-delimit enables the use of dependency information to delimit windows. Enabling this will disable DELIMITERS entirely, but will not affect the behavior of SOFT-DELIMITERS nor the hard/soft cohort limits.
Windows are delimited if a cohort has a node number less than or equal to the highest previously seen node number, and also if a cohort has a node number that seems like a discontinuous jump up in numbers. The discontinuous limit is by default 10 but you can pass a number to -D/--dep-delimit to set it yourself. Some systems do not output dependency numbers for punctuation, so setting it too low may break those; the default 10 was chosen since it is unlikely any real text would have 10 sequential cohorts not part of the tree.
For example: #4→5 followed by #3→4 will delimit. #4→5 followed by #4→4 will delimit. #4→5 followed by #15→4 will delimit. #4→5 followed by #5→3 will not delimit.
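The delimiting condition can be written as a short Python check. This is a sketch of the rule as described, not CG-3's code. The names `delimits` and `split_windows` are hypothetical, and two details are assumptions: a jump counts as discontinuous only when it is strictly greater than the limit, and the delimiting cohort opens the new window.

```python
def delimits(highest_seen, node, jump_limit=10):
    """Return True if a cohort numbered `node` should start a new window."""
    if node <= highest_seen:
        return True
    # Assumption: a jump is discontinuous when it exceeds jump_limit.
    return node - highest_seen > jump_limit


def split_windows(node_numbers, jump_limit=10):
    """Group a sequence of node numbers into windows using delimits()."""
    windows, current, highest = [], [], 0
    for n in node_numbers:
        if current and delimits(highest, n, jump_limit):
            windows.append(current)
            current, highest = [], 0
        current.append(n)
        highest = max(highest, n)
    if current:
        windows.append(current)
    return windows
```

Under these assumptions, `split_windows([1, 2, 3, 4, 5, 1, 2, 3])` gives two windows, split where the numbering restarts at 1.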
It is also possible to create or modify the tree on-the-fly with rules. See SETPARENT and SETCHILD. Dependencies created in this fashion will be output in the same format as above.
For example:
SETPARENT (@>N) (0 (ART DET)) TO (1* (N)) ;
SETPARENT (@<P) TO (-1* (PRP)) (NEGATE 1* (V)) ;
In either case, once you have a dependency tree to work with, you can use it in subsequent contextual tests as seen below. These positions can be combined with the window spanning options.
The 'pp' position asks for an ancestor of the current position, where ancestor is defined as any parent, grand-parent, great-grand-parent, etc...
(-1* (N) LINK pp (ADJ))
The difference between pp and p* is analogous to the difference between cc and c*.
The 'cc' position asks for a descendant of the current position, where descendant is defined as any child, grand-child, great-grand-child, etc...
(-1* (N) LINK cc (ADJ))
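The sets of cohorts that the ancestor and descendant positions range over can be modelled in Python. The sketch below assumes the tree is a dict from node number to parent number, with 0 as the root; the names `ancestors` and `descendants` are hypothetical. It only shows which nodes count as ancestors or descendants. It does not model the order in which CG-3 searches them, nor how the positions interact with LINK.

```python
def ancestors(parent_of, node):
    """Return parent, grand-parent, ... of node, nearest first."""
    result, seen = [], {node}
    cur = parent_of.get(node, 0)
    while cur != 0 and cur not in seen:  # stop at the root or on a loop
        result.append(cur)
        seen.add(cur)
        cur = parent_of.get(cur, 0)
    return result


def descendants(parent_of, node):
    """Return children, grand-children, ... of node, breadth-first."""
    children = {}
    for child, parent in parent_of.items():
        if child != parent:  # ignore self-attached nodes
            children.setdefault(parent, []).append(child)
    result, queue, seen = [], [node], {node}
    while queue:
        cur = queue.pop(0)
        for child in sorted(children.get(cur, [])):
            if child not in seen:
                seen.add(child)
                result.append(child)
                queue.append(child)
    return result
```

For the tree `{1: 3, 2: 3, 3: 0, 4: 5, 5: 3}`, node 4 has ancestors 5 and 3, and node 3 has descendants 1, 2, 5 and 4.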
The 'S' option allows the test to look at the current target as well. Used in conjunction with p, c, cc, s, or r to test self and the relations.
Be aware that BARRIER and CBARRIER will check and thus possibly stop at the current target when 'S' is in effect. This can be toggled on a per context basis with modifier N.
(cS (ADJ))
The 'N' option causes barriers to ignore the self cohort. If self-no-barrier is enabled, then instead it forces barriers to respect the self cohort.
The '*' option behaves differently when dealing with dependencies. Here, the '*' option allows the test to perform a deep scan. Used in conjunction with p, c, or s to continue until there are no deeper relations. For example, position 'c*' tests the children, grand-children, great-grand-children, and so forth.
(c* (ADJ))
The 'll' option limits the search to the leftmost cohort of the possible matches. Note that this cohort may be to the right of the current target; use 'lll' if you want the leftmost of the cohorts to the left of the current target, or 'llr' if you want the leftmost of the cohorts to the right of the current target.
(llc (ADJ))
The 'rr' option limits the search to the rightmost cohort of the possible matches. Note that this cohort may be to the left of the current target; use 'rrr' if you want the rightmost of the cohorts to the right of the current target, or 'rrl' if you want the rightmost of the cohorts to the left of the current target.
(rrc (ADJ))
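The six leftmost/rightmost variants can be compared side by side with a small Python model. This is an illustrative sketch only: cohorts are represented by integer positions in the window, `matches` stands for the positions that already satisfy the dependency test, and the function name `pick` is hypothetical.

```python
def pick(matches, target, option):
    """Select one position from `matches` for a leftmost/rightmost option."""
    if option in ("lll", "rrl"):
        pool = [m for m in matches if m < target]  # only left of the target
    elif option in ("llr", "rrr"):
        pool = [m for m in matches if m > target]  # only right of the target
    else:  # "ll" or "rr": both sides are eligible
        pool = list(matches)
    if not pool:
        return None
    # "ll*" options take the leftmost of the pool, "rr*" the rightmost.
    return min(pool) if option.startswith("ll") else max(pool)
```

With matches at positions 7 and 9 and the target at 5, `pick([7, 9], 5, "ll")` returns 7, a cohort to the right of the target, whereas `pick([7, 9], 5, "lll")` finds nothing.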