Defines a new set based on a list of tags, or appends to an existing set. Composite tags in () require that all tags match. LIST cannot perform set operations: all elements of a LIST definition are parsed as literal tags, not other sets.
LIST setname = tag othertag (mtag htag) ltag ;
LIST setname += even more tags ;
If the named set for += is of SET-type, then the new tags will be in a set OR'ed onto the existing one. See set manipulation.
Avoid cluttering your grammar with LIST N = N; definitions by using LIST-TAGS or STRICT-TAGS instead.
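As a rough sketch of what that replacement could look like, the fragment below declares three tags in one statement. It assumes the `LIST-TAGS += ... ;` form and uses the hypothetical tags N, V and ADJ; check the grammar chapter for the exact syntax supported by your version.

```
# Instead of: LIST N = N ; LIST V = V ; LIST ADJ = ADJ ;
# (assumed form; N, V and ADJ are illustrative tags)
LIST-TAGS += N V ADJ ;
```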
Defines a new set based on operations between existing sets. To include literal tags or composite tags in operations, define an inline set with ().
SET setname = someset + someotherset - (tag) ;
Equivalent to the mathematical set union ∪ operator.
LIST a = a b c d ;
LIST b = c d e f ;
# Logically yields a set containing tags: a b c d e f
# Practically a reading must match either set
SET r = a OR b ;
SET r = a | b ;
Equivalent to the SQL Except operator.
LIST a = a b c d ;
LIST b = c d e f ;
# Logically yields a set containing tags: a b !c !d !e !f
# Practically a reading must match the first set and must not match the second set
SET r = a - b ;
Equivalent to the mathematical set complement ∖ operator. The symbol is a normal backslash.
LIST a = a b c d ;
LIST b = c d e f ;
# Logically yields a set containing tags: a b
SET r = a \ b ;
Equivalent to the mathematical set symmetric difference ∆ operator. The symbol is the Unicode code point U+2206.
LIST a = a b c d ;
LIST b = c d e f ;
# Logically yields a set containing tags: a b e f
SET r = a ∆ b ;
Equivalent to the mathematical set intersection ∩ operator. The symbol is the Unicode code point U+2229.
LIST a = a b c d ;
LIST b = c d e f ;
# Logically yields a set containing tags: c d
SET r = a ∩ b ;
Equivalent to the mathematical set cartesian product × operator.
LIST a = a b c d ;
LIST b = c d e f ;
# Logically yields a set containing tags: (a c) (b c) c (d c) (a d) (b d) d (a e)
#  (b e) (c e) (d e) (a f) (b f) (c f) (d f)
# Practically a reading must match both sets
SET r = a + b ;
On its own, the ^ operator is equivalent to set difference -. But when followed by other sets, it becomes a blocker. In A - B OR C + D, either A - B or C + D may suffice for a match. However, in A ^ B OR C + D, if B matches then it blocks the rest and fails the entire set match without considering C or D.
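To make the contrast concrete, here is a minimal sketch of the two forms side by side. The set names Lenient and Strict and the single-tag lists are hypothetical, chosen only for illustration; the comments restate the behavior described above rather than tested output.

```
LIST A = a ;
LIST B = b ;
LIST C = c ;
LIST D = d ;

# Set difference: a reading tagged "a b c d" fails A - B (B matches),
# but can still match through C + D
SET Lenient = A - B OR C + D ;

# Fail-fast: for the same reading, B matches and blocks the rest,
# so the whole set fails without C + D being considered
SET Strict = A ^ B OR C + D ;
```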
A set containing the (*) tag becomes a magic "any" set and will always match. This saves having to declare a dummy set containing all imaginable tags. Useful for testing whether a cohort exists at a position, without needing details about it. Can also be used to match everything except a few tags with the set operator -.
(*-1 (*) LINK 1* SomeSet)
SELECT (*) - NotTheseTags ;
The magic set _S_DELIMITERS_ is created from the DELIMITERS definition. This saves having to declare and maintain a separate set for matching delimiters in tests.
SET SomeSet = OtherSet OR _S_DELIMITERS_ ;
The magic set _S_SOFT_DELIMITERS_ is created from the SOFT-DELIMITERS definition.
(**1 _S_SOFT_DELIMITERS_ BARRIER BoogieSet)
A magic set containing the single tag (_TARGET_). This set and tag will only match when the currently active cohort is the target of the rule.
A magic set containing the single tag (_MARK_). This set and tag will only match when the currently active cohort is the mark set with X, or if no such mark is set it will only match the target of the rule.
A magic set containing the single tag (_ATTACHTO_). This set and tag will only match when the currently active cohort is the mark set with A.
UNDEF-SETS lets you undefine/unlink sets so later definitions can reuse the name.
This does not delete a set, nor can it alter past uses of a set. Prior uses of a set remain linked to the old set.
LIST ADV = ADV ;
LIST VFIN = (V FIN) ;
UNDEF-SETS = VFIN ADV ;
SET ADV = A OR D OR V ;
LIST VFIN = VFIN ;
LIST with += lets you append tags to an existing LIST or SET.
This does not alter past uses of a set. Prior uses of a set remain linked to the old definition.
For LIST-type sets, this creates a new set that is a combination of all tags from the existing set plus all the new tags.
For SET-type sets, the new tags are OR'ed onto the existing set. This can lead to surprising behavior if the existing set is complex.
LIST VFIN = (V FIN) ;
LIST VFIN += VFIN ;
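The example above covers the LIST-type case. For the SET-type case, a minimal sketch could look like the following; the set name NV and the tags are hypothetical, and the comment describes the expected effect as stated above, not verified output.

```
LIST N = N ;
LIST V = V ;
SET NV = N OR V ;

# NV is SET-type, so ADJ is OR'ed onto the existing set;
# later uses of NV should behave like N OR V OR (ADJ)
LIST NV += ADJ ;
```

Keeping SET-type targets this simple avoids the surprises mentioned above, since the appended tags are always joined with OR regardless of which operators the original definition used.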
Each time a rule is run on a reading, the tag that first satisfied the set must be the same as all subsequent matches of the same set in tests.
A set is marked as a tag unification set by prefixing $$ to the name when used in a rule. You can only prefix existing sets; inline sets in the form of $$(tag tags) will not work, but $$Set + $$OtherSet will; that method will make 2 unification sets, though.
The regex tags <.*>r ".*"r "<.*>"r are special and will unify to the same exact tag of that type. This is useful for e.g. mandating that the baseform must be exactly the same in all places.
For example
LIST ROLE = <human> <anim> <inanim> (<bench> <table>) ;
SELECT $$ROLE (-1 KC) (-2C $$ROLE) ;
which would logically be the same as
SELECT (<human>) (-1 KC) (-2C (<human>)) ;
SELECT (<anim>) (-1 KC) (-2C (<anim>)) ;
SELECT (<inanim>) (-1 KC) (-2C (<inanim>)) ;
SELECT (<bench> <table>) (-1 KC) (-2C (<bench> <table>)) ;
Caveat: The exploded form is not identical to the unified form. Unification rules are run as normal rules, meaning once per reading. The exploded form would be run in order as separate rules per reading. There may be side effects due to that.
Caveat 2: The behavior of this next rule is undefined:
SELECT (tag) IF (0 $$UNISET) (-2* $$UNISET) (1** $$UNISET) ;
Since the order of tests is dynamic, the unification of $$UNISET will be initialized with essentially random data, and as such cannot be guaranteed to unify properly. Well defined behavior can be enforced in various ways:
# Put $$UNISET in the target
SELECT (tag) + $$UNISET IF (-2* $$UNISET) (1** $$UNISET) ;
# Only refer to $$UNISET in a single linked chain of tests
SELECT (tag) IF (0 $$UNISET LINK -2* $$UNISET LINK 1** $$UNISET) ;
# Use rule option KEEPORDER
SELECT KEEPORDER (tag) IF (0 $$UNISET) (-2* $$UNISET) (1** $$UNISET) ;
Having the unifier in the target is usually the best way to enforce behavior.
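The special regex tags mentioned earlier can be combined with this advice. The sketch below is an assumption-laden illustration: the set name BASEFORM is hypothetical, and the rule is meant to select a reading only if the cohort to the left has a reading with exactly the same baseform.

```
# BASEFORM is a hypothetical set holding only the special baseform regex tag
LIST BASEFORM = ".*"r ;

# The unifier sits in the target, so it is initialized from the target reading
# before the test to the left is evaluated
SELECT $$BASEFORM IF (-1 $$BASEFORM) ;
```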
Each time a rule is run on a reading, the top-level set that first satisfied the match must be the same as all subsequent matches of the same set in tests.
A set is marked as a top-level set unification set by prefixing && to the name when used in a rule. You can only prefix existing sets; inline sets in the form of &&(tag tags) will not work, but &&Set + &&OtherSet will; that method will make 2 unification sets, though.
For example
LIST SEM-HUM = <human> <person> <sapien> ;
LIST SEM-ANIM = <animal> <beast> <draconic> ;
LIST SEM-INSECT = <insect> <buzzers> ;
SET SEM-SMARTBUG = SEM-INSECT + (<sapien>) ;
SET SAME-SEM = SEM-HUM OR SEM-ANIM + SEM-SMARTBUG ; # During unification, OR and + are ignored
SELECT &&SAME-SEM (-1 KC) (-2C &&SAME-SEM) ;
which would logically be the same as
SELECT SEM-HUM (-1 KC) (-2C SEM-HUM) ;
SELECT SEM-ANIM (-1 KC) (-2C SEM-ANIM) ;
SELECT SEM-SMARTBUG (-1 KC) (-2C SEM-SMARTBUG) ;
Note that the unification only happens on the first level of sets, hence named top-level unification. Note also that the set operators in the prefixed set are ignored during unification.
You can use the same set for different unified matches by prefixing the set name with a number and colon. E.g., &&SAME-SEM is a different match than &&1:SAME-SEM.
The same caveats as for Tag Unification apply.
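As an illustration of numbered prefixes, the sketch below reuses SAME-SEM from the earlier example and unifies it twice within one rule. The rule itself is hypothetical; it keeps the unnumbered unifier in the target and confines the numbered one to a single linked chain, following the caveats for tag unification.

```
# &&SAME-SEM: the target and the cohort to the left must match the same top-level set
# &&1:SAME-SEM: the two cohorts to the right must match the same top-level set as
# each other, independently of what &&SAME-SEM unified to
SELECT &&SAME-SEM IF (-1C &&SAME-SEM) (1 &&1:SAME-SEM LINK 1 &&1:SAME-SEM) ;
```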