Constraint Grammar Manual

3rd version of the CG formalism variant

Mr. Tino Didriksen

GrammarSoft ApS




Table of Contents

1. Intro
Caveat Emptor
What this is...
Naming
Unicode
Discussion, Mailing Lists, Bug Reports, etc.
2. License
GNU General Public License
Open Source Exception
Commercial/Proprietary License
3. Installation & Updating
CMake Notes
Ubuntu / Debian
Fedora / Red Hat / CentOS / OpenSUSE
Mac OS X
Homebrew
MacPorts
Other
Windows
Installing ICU
Getting & Compiling VISL CG-3
Updating VISL CG-3
Regression Testing
Cygwin
4. Contributing & Subversion Access
5. Compatibility and Incompatibilities
Gotchas
Magic Readings
NOT and NEGATE
PREFERRED-TARGETS
Default Codepage / Encoding
Set Operator -
Scanning Past Point of Origin
>>> and <<<
Rule Application Order
Endless Loops
Visibility of Mapped Tags
Contextual position 'cc' vs. 'c*'
Incompatibilities
Mappings
Baseforms & Mixed Input
6. Command Line Reference
Order of argument sources
vislcg3
cg-conv
cg-comp
cg-proc
cg-strictify
cg3-autobin.pl
7. Input/Output Stream Format
Apertium Format
HFST/XFST Format
VISL CG Format
Niceline CG Format
Plain Text Format
8. Grammar
REOPEN-MAPPINGS
CMDARGS, CMDARGS-OVERRIDE
OPTIONS
safe-setparent
addcohort-attach
no-inline-sets
no-inline-templates
strict-wordforms
strict-baseforms
strict-secondary
strict-regex
strict-icase
self-no-barrier
INCLUDE
Sections
BEFORE-SECTIONS
SECTION
AFTER-SECTIONS
NULL-SECTION
Ordering of sections in grammar
--sections with ranges
9. Rules
Cheat Sheet
ADD
COPY
DELIMIT
EXTERNAL
ADDCOHORT
COPYCOHORT
REMCOHORT
SPLITCOHORT
MERGECOHORTS
MOVE, SWITCH
REPLACE
APPEND
SUBSTITUTE
SETVARIABLE
REMVARIABLE
MAP
UNMAP
PROTECT
UNPROTECT
SELECT
REMOVE
IFF
RESTORE
Tag Lists Can Be Sets
Named Rules
Flow Control: JUMP, ANCHOR
WITH
Rule Options
NEAREST
ALLOWLOOP
ALLOWCROSS
DELAYED
IMMEDIATE
IGNORED
LOOKDELAYED
LOOKIGNORED
LOOKDELETED
UNMAPLAST
UNSAFE
SAFE
REMEMBERX
RESETX
KEEPORDER
VARYORDER
ENCL_INNER
ENCL_OUTER
ENCL_FINAL
ENCL_ANY
WITHCHILD
NOCHILD
ITERATE
NOITERATE
REVERSE
SUB:N
OUTPUT
REPEAT
NOMAPPED
NOPARENT
10. Contextual Tests
Position Element Order
1, 2, 3, etc
@
C
NOT
NEGATE
Scanning
BARRIER
CBARRIER
Spanning Window Boundaries
Span Both
Span Left
Span Right
X Marks the Spot
Set Mark
Jump to Mark
Attach To / Affect Instead
Merge With
Jump to Cohort
Test Deleted/Delayed Readings
Look at Deleted Readings
Look at Delayed Readings
Look at Ignored Readings
Scanning Past Point of Origin
--no-pass-origin, -o
No Pass Origin
Pass Origin
Nearest Neighbor
Active/Inactive Readings
Bag of Tags
Optional Frequencies
Dependencies
Relations
11. Parenthesis Enclosures
Example
Contextual Position L
Contextual Position R
Magic Tag _LEFT_
Magic Tag _RIGHT_
Magic Tag _ENCL_
Magic Set _LEFT_
Magic Set _RIGHT_
Magic Set _ENCL_
Magic Set _PAREN_
12. Making use of Dependencies
SETPARENT
SETCHILD
Existing Trees in Input
Using Dependency as Delimiters
Creating Trees from Grammar
Contextual Tests
Parent
Ancestors
Children
Descendants
Siblings
Self
No Barrier
Deep Scan
Left of
Right of
Leftmost
Rightmost
All Scan
None Scan
13. Making use of Relations
ADDRELATION, ADDRELATIONS
SETRELATION, SETRELATIONS
REMRELATION, REMRELATIONS
Existing Relations in Input
Contextual Tests
Specific Relation
Any Relation
Self
Left/right of, Left/rightmost
All Scan
None Scan
14. Making use of Probabilistic / Statistical Input
15. Templates
Position Override
16. Sets
Defining Sets
LIST
SET
Set Operators
Union: OR and |
Except: -
Difference: \
Symmetric Difference: ∆
Intersection: ∩
Cartesian Product: +
Fail-Fast: ^
Magic Sets
(*)
_S_DELIMITERS_
_S_SOFT_DELIMITERS_
Magic Set _TARGET_
Magic Set _MARK_
Magic Set _ATTACHTO_
Magic Set _SAME_BASIC_
Set Manipulation
Undefining Sets
Appending to Sets
Unification
Tag Unification
Top-Level Set Unification
17. Tags
Tag Order
Literal String Modifiers
Regular Expressions
Line Matching
Variable Strings
Numerical Matches
Stream Metadata
Stream Static Tags
Global Variables
Local Variables
Fail-Fast Tag
STRICT-TAGS
LIST-TAGS
18. Sub-Readings
Apertium Format
CG Format
Grammar Syntax
Rule Option SUB:N
Contextual Option /N
19. Profiling / Code Coverage
What and why
Gathering profiling data
Annotating
20. Binary Grammars
Security of Binary vs. Textual
Loading Speed of Binary Grammars
How to...
Incompatibilities
vislcg / bincg / gencg
--grammar-info, --grammar-out, --profile
21. External Callbacks and Processors
Protocol Datatypes
Protocol Flow
22. Input Stream Commands
Exit
Flush
Ignore
Resume
Set Variable
Unset Variable
23. FAQ & Tips & Tricks
FAQ
How far will a (*-1C A) test scan?
How can I match the tag * from my input?
Tricks
Determining whether a cohort has (un)ambiguous base forms
Attach all cohorts without a parent to the root
Use multiple cohorts as a barrier
Add a delimiting cohort
24. Constraint Grammar Glossary
Baseform
Cohort
Contextual Target
Contextual Test
Dependency
Disambiguation Window
Mapping Tag
Mapping Prefix
Reading
Rule
Set
Tag
Wordform
25. Constraint Grammar Keywords
ADD
ADDCOHORT
ADDRELATION
ADDRELATIONS
AFTER-SECTIONS
ALL
AND
APPEND
BARRIER
BEFORE-SECTIONS
CBARRIER
CONSTRAINTS
COPY
CORRECTIONS
DELIMIT
DELIMITERS
END
EXTERNAL
IF
IFF
INCLUDE
LINK
LIST
MAP
MAPPINGS
MAPPING-PREFIX
MOVE
NEGATE
NONE
NOT
NULL-SECTION
OPTIONS
PREFERRED-TARGETS
REMCOHORT
REMOVE
REMRELATION
REMRELATIONS
REPLACE
SECTION
SELECT
SET
SETCHILD
SETPARENT
SETRELATION
SETRELATIONS
SETS
SOFT-DELIMITERS
STATIC-SETS
STRICT-TAGS
SUBSTITUTE
SWITCH
TARGET
TEMPLATE
TEXT-DELIMITERS
TO
UNDEF-SETS
UNMAP
26. Drafting Board
MATCH
EXECUTE
References
Index

List of Tables

17.1. Valid Operators
17.2. Comparison Truth Table