
 % layout 'main', title => 'KorAP: CQP';

 %= page_title

 <p>The following documentation introduces all features provided by our
   version of the CQP Query Language and some KorAP-specific extensions.
   This tutorial is based on the IMS Open Corpus Workbench (CWB)

   <%= ext_link_to 'CQP Query Language Tutorial, version 3.4.24 (May 2020)',"https://cwb.sourceforge.io/files/CQP_Manual.pdf"  %>
   and on
   <%= embedded_link_to 'doc', 'the KorAP Poliqarp+ tutorial', 'ql', 'poliqarp-plus' %>.</p>

 <section id="segments">
   <h3>Simple Segments</h3>
   <p>The atomic elements of CQP queries are segments. Most of the time,
     segments represent words and can be queried by enclosing them in
     single quotes or double quotes:</p>


   %= doc_query cqp => loc('Q_cqp_simplesquote', "** 'Tree'")

   <p>or</p>

   %= doc_query cqp => loc('Q_cqp_simpledquote', '** "Tree"')

   <p>A word segment is always interpreted as a <%= embedded_link_to 'doc', 'regular expression', 'ql', 'regexp' %>. For example, the following query matches both "run" and "ran":</p>

   %= doc_query cqp => loc('Q_cqp_re', '** "r(u|a)n"'), cutoff => 1

   %# <p>can return both "Tannenbaum" and "baum".</p>

   <p>Sequences of simple segments are expressed using a space delimiter:</p>

   %= doc_query cqp => loc('Q_cqp_simpleseq1', '** "the" "Tree"')

   %= doc_query cqp => loc('Q_cqp_simpleseq2', "** 'the' 'Tree'")

   %# ------------------- Current state (ND)

   <p>Originally, CQP was developed as a corpus query processor tool, and
     every CQP command had to be followed by a semicolon. <%= ext_link_to 'The CQPweb server', "https://cqpweb.lancs.ac.uk/" %> at
     Lancaster treats the semicolon as optional, and we implemented it in
     the same way.</p>
   <p>Simple segments always refer to the surface form of a word. To search
     for surface forms without case sensitivity, you can use the <code>%c</code>
     flag:</p>


   %= doc_query cqp => loc('Q_cqp_simplescflag', '"laufen"%c'), cutoff => 0


   <p>The query above will find all occurrences of the term irrespective of
     the capitalization of letters.</p>
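
   <p>As a rough analogy only, the effect of <code>%c</code> can be sketched
     in Python. The helper name <code>matches_ignoring_case</code> and the token
     list are illustrative assumptions, and Python's <code>re</code> module is
     not the engine KorAP uses. The sketch also assumes that a pattern has to
     cover the whole token:</p>

```python
import re


def matches_ignoring_case(pattern: str, token: str) -> bool:
    # fullmatch requires the pattern to cover the whole token (an assumption
    # of this sketch); re.IGNORECASE stands in for the %c flag.
    return re.fullmatch(pattern, token, flags=re.IGNORECASE) is not None


# Hypothetical token list, used only for illustration.
tokens = ["laufen", "Laufen", "LAUFEN", "verlaufen"]
hits = [t for t in tokens if matches_ignoring_case("laufen", t)]
print(hits)
```

   <p>Under these assumptions, all three capitalization variants are kept,
     while <code>verlaufen</code> is rejected because the pattern does not
     cover the whole token.</p>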

   <p>Ignoring diacritics is not supported yet.</p>

   <!-- EM
   <p>To ignore diacritics, you can use the <code>%d</code> flag:</p>


   %= doc_query cqp => loc('Q_cqp_simplesidia2', '"Fraulein"%d'), cutoff => 0


   <p>The query above will find all occurrences of the term irrespective of
     the use of diacritics (i.e., <code>Fräulein</code> and <code>Fraulein</code>).</p>

   <p>Flags can be combined to ignore both case sensitivity and diacritics:</p>


   %= doc_query cqp => loc('Q_cqp_simplesegidia2', '"Fraulein"%cd'), cutoff => 0

   <p>The query above will find all occurrences of the term irrespective of
     the use of diacritics or of capitalization: <code>fraulein</code>, <code>Fraulein</code>,
     <code>fräulein</code>, etc.</p>
 -->

   <h4 id="regexp">Regular Expressions</h4>
         <p>Special regular expression characters such as <code>.</code>, <code>?</code>,
           <code>*</code>, <code>+</code>, <code>|</code>, <code>(</code>, <code>)</code>,
           <code>[</code>, <code>]</code>, <code>{</code>, <code>}</code>, <code>^</code>,
           <code>$</code> have to be escaped with a backslash (<code>\</code>):</p>
         <ul>
           <li><code>"?";</code> fails, while <code>"\?";</code> returns <code>?</code>.</li>
           <li><code>"."</code> matches any single character, while <code>"\$\."</code>
             returns <code>$.</code></li>
         </ul>
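
         <p>The escaping rules listed above can be tried out with Python's
           <code>re</code> module, which treats the same characters as special.
           This is only an analogy: the helper <code>token_matches</code> is
           hypothetical, and KorAP's regular expression engine may differ in
           details:</p>

```python
import re


def token_matches(pattern: str, token: str) -> bool:
    # Hypothetical helper: the pattern must cover the whole token.
    return re.fullmatch(pattern, token) is not None


# A bare "?" is a quantifier with nothing to repeat, so compiling it fails.
try:
    re.compile("?")
    bare_question_mark_ok = True
except re.error:
    bare_question_mark_ok = False

print(bare_question_mark_ok)
print(token_matches(r"\?", "?"))     # escaped: matches a literal question mark
print(token_matches(".", "x"))       # unescaped dot: any single character
print(token_matches(r"\$\.", "$."))  # escaped: matches the literal string "$."

# re.escape adds the backslashes for you.
print(re.escape("$."))
```

         <p>In Python, <code>re.escape</code> is a convenient way to build a
           pattern from a literal string; whether every escape it produces is
           accepted by KorAP is not guaranteed.</p>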
         <blockquote class="warning">
           <p>Beware: Queries that start with a <code>.*</code> expression can
             become extremely slow!</p>
           <p>In Poliqarp+, only double quotes are used for regular expressions,
             while single quotes are used to mark verbatim strings. In CQP, you
             can use the <code>%l</code> flag to match the string verbatim.</p>
         </blockquote>
         <p>To match a word form containing single or double quotes, use one of
           the following methods:</p>
         <ul>
           <li>if the string you need to match contains either single or
             double quotes, use the other quote character to enclose the
             string: </li>

             %= doc_query cqp => loc('Q_cqp_regexqu1', '"It\'s"'), cutoff => 0

             <!-- EM
             %= doc_query cqp => loc('Q_cqp_xxxx', '\'12"-screen\''), cutoff => 0
             -->

           <li>escape the quotes by doubling every occurrence of the quote
             character inside the string: </li>

             %= doc_query cqp => loc('Q_cqp_regexequ1', '\'It\'\'s\''), cutoff => 0

             <!-- %= doc_query cqp => loc('Q_cqp_regexequ2', '"12""-screen"'), cutoff => 0 -->

           <li>escape the quotes with a backslash (<code>\</code>): </li>

             %= doc_query cqp => loc('Q_cqp_regexequ3', "'It\\'s'"), cutoff => 0

             <!-- %= doc_query cqp => loc('Q_cqp_regexequ4', '"12\\"-screen"'), cutoff => 0 -->
         </ul>
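
         <p>When queries are generated by a program, the doubling method from
           the list above is easy to automate. The following Python helper is a
           minimal sketch; the name <code>cqp_quote</code> is hypothetical, and
           it handles only the quote characters, not the special regular
           expression characters:</p>

```python
def cqp_quote(text: str, quote: str = "'") -> str:
    """Wrap text in the given quote character, doubling embedded occurrences."""
    if quote not in ("'", '"'):
        raise ValueError("quote must be a single or double quote")
    return quote + text.replace(quote, quote * 2) + quote


print(cqp_quote("It's"))             # single-quoted, apostrophe doubled
print(cqp_quote('12"-screen', '"'))  # double-quoted, double quote doubled
```

         <p>A string without the chosen quote character is simply wrapped
           unchanged.</p>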
       </section>
       <section id="complex">
         <h3>Complex Segments</h3>
         <p>Complex segments are expressed in square brackets and contain
           additional information on the resource of the term under scrutiny by
           providing key/value pairs, separated by an equals sign.</p>
         <p>The KorAP implementation of CQP provides three special segment keys:
           <code>orth</code> for surface forms, <code>base</code> for lemmata,
           and <code>pos</code> for part-of-speech. The following complex query
           finds all occurrences of the given surface form:</p>

   %= doc_query cqp => loc('Q_cqp_compsl1', '[orth="Baum"]'), cutoff => 0


         <p>The query is thus equivalent to:</p>

   %= doc_query cqp => loc('Q_cqp_compsl2', '"Baum"'), cutoff => 0


         <p>Complex segments expect simple expressions as values, meaning that
           the following expression is valid as well:</p>

   %= doc_query cqp => loc('Q_cqp_compsse', '[orth="l(au|ie)fen"%c]'), cutoff => 1


         <p>Another special key is <code>base</code>, referring to the lemma
           annotation of the <%= embedded_link_to 'doc', 'default foundry', 'data', 'annotation'%>. The following query finds all occurrences of segments
           annotated with the specified lemma by the default foundry:</p>

   %= doc_query cqp => loc('Q_cqp_compsbase', '[base="Baum"]'), cutoff => 1


         <p>The third special key is <code>pos</code>, referring to the
           part-of-speech annotation of the <%= embedded_link_to 'doc', 'default foundry', 'data', 'annotation'%>. The following query finds all attributive adjectives:</p>

   %= doc_query cqp => loc('Q_cqp_compspos', '[pos="ADJA"]'), cutoff => 1


         <p>Complex segments requesting further token annotations can have keys
           following the <code>foundry/layer</code> notation. For example, the
           following queries find plural words and present-tense forms,
           respectively, as annotated in the <code>marmot</code> foundry:</p>

   %= doc_query cqp => loc('Q_cqp_compstoken1', '[marmot/m="number":"pl"]'), cutoff => 1


   %= doc_query cqp => loc('Q_cqp_compstoken2', "[marmot/m='tense':'pres']"), cutoff => 1


         <p>In case an annotation contains special non-alphabetic and non-numeric
           characters, the annotation part can be followed by <code>%l</code> to
           ensure a verbatim interpretation:</p>

   %= doc_query cqp => loc('Q_cqp_compstokenverb', "[orth='https://de.wikipedia.org'%l]"), cutoff => 1


         <h4>Negation</h4>
         <p>Terms in complex expressions can be negated by placing an
           exclamation mark either before the equals sign or before the whole
           expression.</p>

   %= doc_query cqp => loc('Q_cqp_neg1', '[pos!="ADJA"] "Haare"'), cutoff => 1


   %= doc_query cqp => loc('Q_cqp_neg2', '[!pos="ADJA"] "Haare"'), cutoff => 1


         <blockquote class="warning">
           <p>Beware: Negated complex segments can't be searched as a single
             statement. However, they work in case they are part of a <%= embedded_link_to 'doc', 'sequence', 'ql', 'poliqarp-plus#syntagmatic-operators-sequence'%>.</p>
         </blockquote>
         <h4 id="empty-segments">Empty Segments</h4>
         <p>A special segment is the empty segment, which matches every word in
           the index.</p>

   %= doc_query cqp => loc('Q_cqp_empseq', '[]'), cutoff => 1


         <p>Empty segments are useful to express distances of words by using
            <%= embedded_link_to 'doc', 'repetitions', 'ql', 'poliqarp-plus#syntagmatic-operators-repetitions'%>.</p>
         <blockquote class="warning">
           <p>Beware: Empty segments can't be searched as a single statement.
             However, they work in case they are part of a <%= embedded_link_to 'doc', 'sequence', 'ql', 'poliqarp-plus#syntagmatic-operators-sequence'%>.</p>
         </blockquote>
       </section>
       <section id="spans">
         <h3>Span Segments</h3>
         <p>Not all segments are bound to words. Some are bound to concepts
           spanning multiple words, for example noun phrases, sentences, or
           paragraphs. Span segments are structural elements, and their
           syntax depends on the context. When used in complex segments,
           they have to be enclosed in angle brackets:

           %= doc_query cqp => loc('Q_cqp_spansegm', '<corenlp/c=NP>'), cutoff => 1

           Some spans such as <code>s</code> are special keywords that can be
           used without angle brackets, as operands of specific functional
           operators like <code>within</code>, <code>region</code>, <code>lbound</code>,
           <code>rbound</code> or <code>MU(meet)</code>.

           <!-- EM
           but when used with specific functional
           operators like <code>within</code>, <code>region</code>, <code>lbound</code>,
           <code>rbound</code> or <code>MU(meet)</code>, the angular brackets
           are not mandatory.
           -->
         </p>
       </section>
       <section id="paradigmatic-operators">
         <h3>Paradigmatic Operators</h3>
         <p>A complex segment can specify multiple properties that a token must
           satisfy. For example, to search for all words with a certain surface
           form of a particular lemma (no matter if capitalized or not), you can
           search for:</p>

   %= doc_query cqp => loc('Q_cqp_parseg', '[orth="laufe"%c & base="Lauf"]'), cutoff => 1


         <p>The ampersand combines multiple properties with a logical AND. Terms
           of the complex segment can be negated as introduced before. The
           following queries are equivalent:</p>

   %= doc_query cqp => loc('Q_cqp_parsegamp1', '[orth="laufe"%c & base!="Lauf"]'), cutoff => 1


   %= doc_query cqp => loc('Q_cqp_parsegamp2', '[orth="laufe"%c & !base="Lauf"]'), cutoff => 1


         <p>Alternatives can be expressed by using the pipe symbol:</p>

   %= doc_query cqp => loc('Q_cqp_parsegalt', '[base="laufen" | base="gehen"]'), cutoff => 1


         <p>All these subexpressions can be grouped using round brackets to form
           complex boolean expressions:</p>

   %= doc_query cqp => loc('Q_cqp_parsegcb', '[(base="laufen" | base="gehen") & tt/pos="VVFIN"]'), cutoff => 1


         <p>Round brackets can also be used to enclose single terms to
         improve query readability, although they are not necessary:</p>

   %= doc_query cqp => loc('Q_cqp_parsegrb', '[(base="laufen" | base="gehen") & (tt/pos="VVFIN")]'), cutoff => 1


         <p>The negation operator can be used outside expressions grouped by round
         brackets. Be aware of <%= ext_link_to "De Morgan's Laws", "https://en.wikipedia.org/wiki/De_Morgan%27s_laws" %> when you design your queries: the following query</p>

   %= doc_query cqp => loc('Q_cqp_parsegneg1', '[(!(base="laufen" | base="gehen")) & (tt/pos="VVFIN")]'), cutoff => 1


         <p>is logically equivalent to:</p>

   %= doc_query cqp => loc('Q_cqp_parsegneg2', '[!(base="laufen") & !(base="gehen") & (tt/pos="VVFIN")]'), cutoff => 1


         <p>which can be written more simply as:</p>

   %= doc_query cqp => loc('Q_cqp_parsegneg3', '[!base="laufen" & !base="gehen" & tt/pos="VVFIN"]'), cutoff => 1


         <p>or as:</p>

   %= doc_query cqp => loc('Q_cqp_parsegneg4', '[base!="laufen" & base!="gehen" & tt/pos="VVFIN"]'), cutoff => 1
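
         <p>The equivalence of the grouped and the expanded negation can be
           checked by enumerating all truth values. The following Python sketch
           is independent of KorAP; the function names and the three boolean
           parameters (standing for <code>base="laufen"</code>,
           <code>base="gehen"</code> and <code>tt/pos="VVFIN"</code>) are
           illustrative assumptions:</p>

```python
from itertools import product


def grouped(is_laufen: bool, is_gehen: bool, is_vvfin: bool) -> bool:
    # Mirrors [(!(base="laufen" | base="gehen")) & (tt/pos="VVFIN")]
    return (not (is_laufen or is_gehen)) and is_vvfin


def expanded(is_laufen: bool, is_gehen: bool, is_vvfin: bool) -> bool:
    # Mirrors [!base="laufen" & !base="gehen" & tt/pos="VVFIN"]
    return (not is_laufen) and (not is_gehen) and is_vvfin


# Compare both forms on all 2**3 combinations of truth values.
equivalent = all(
    grouped(*values) == expanded(*values)
    for values in product([False, True], repeat=3)
)
print(equivalent)
```

         <p>Both forms agree on every combination, which is what De Morgan's
           Laws predict.</p>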


       </section>
       <section id="syntagmatic-operators">
         <h3>Syntagmatic Operators</h3>
         <h4 id="syntagmatic-operators-sequence">Sequences</h4>
         <p>Sequences can be used to search for segments in order. For this,
           simple expressions are separated by whitespace.</p>

   %= doc_query cqp => loc('Q_cqp_syntop1', '"der" "alte" "Mann"'), cutoff => 1


         <p>You can also build sequences from complex segments:</p>

   %= doc_query cqp => loc('Q_cqp_syntop2', '[orth="der"][orth="alte"][orth="Mann"]'), cutoff => 1


         <p>Now you may see the benefit of the empty segment to search for words
           you don't know:</p>

   %= doc_query cqp => loc('Q_cqp_syntop3', '[orth="der"][][orth="Mann"]'), cutoff => 1


         <h4>Position</h4>
         <p>You are also able to mix segments and spans in sequences. In CQP,
           spans are marked by XML-like structural elements signalling the
           beginning and/or the end of a region and they can be used to look for
           segments in a specific position in a bigger structure like a noun
           phrase or a sentence.</p>
         <p>To search for a word at the beginning of a sentence (or a syntactic
           group), use one of the following pairs of equivalent queries:</p>
         <ul>
           <li>
           Both queries match the word "Der" as the first word of a sentence:
           %= doc_query cqp => loc('Q_cqp_posfirst1', '<base/s=s>[orth="Der"]'), cutoff => 1
           %= doc_query cqp => loc('Q_cqp_posfirst2','<s>[orth="Der"]'), cutoff => 1
           </li>
           <li>Both queries match the word "Der" directly after the end of a sentence:
           %= doc_query cqp => loc('Q_cqp_posaend1','</base/s=s>[orth="Der"]'), cutoff => 1
           %= doc_query cqp => loc('Q_cqp_posaend2','</s>[orth="Der"]'), cutoff => 1
         </li>
         </ul>
         <p>To search for a word at the end of a sentence (or a syntactic group),
         you can use:</p>
         <ul>
           <li>Match the word "Mann"
             as the last word of a sentence: </li>

             %= doc_query cqp => loc('Q_cqp_posend1','[orth="Mann"]</base/s=s>'), cutoff => 1
             %= doc_query cqp => loc('Q_cqp_posend2','[orth="Mann"]</s>'), cutoff => 1

           <li>Match the
             word "Mann" directly before the beginning of a sentence, that is,
             as the last word of the previous sentence: </li>

             %= doc_query cqp => loc('Q_cqp_posbbeg1','[orth="Mann"]<base/s=s>'), cutoff => 1
             %= doc_query cqp => loc('Q_cqp_posbbeg2','[orth="Mann"]<s>'), cutoff => 1

         </ul>
         <blockquote class="warning">
         <p>Beware that when searching for longer sequences, sentence boundaries may be crossed. </p>
         </blockquote>
         <p> In the following example, sequences where "für" occurs in a previous
             sentence may also be matched, because of the long sequence of empty
             tokens in the query (minimum 20, maximum 25).
         </p>

         %= doc_query cqp => loc('Q_cqp_posbbeg3', '"für" []{20,25} "uns"</s>'), cutoff => 1
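
         <p>The effect can be reproduced on a toy example. The following Python
           sketch is not how KorAP evaluates queries; the function
           <code>find_gap_matches</code>, the two sample sentences and the small
           gap of 2 to 5 tokens (instead of 20 to 25) are assumptions chosen to
           keep the example short:</p>

```python
def find_gap_matches(sentences, first, last, min_gap, max_gap):
    """Emulate `first []{min_gap,max_gap} last</s>` on a flat token stream."""
    # Flatten to (word, sentence index, is last token of its sentence).
    tokens = []
    for s_idx, sentence in enumerate(sentences):
        for t_idx, word in enumerate(sentence):
            tokens.append((word, s_idx, t_idx == len(sentence) - 1))

    matches = []
    for start, (word, start_sentence, _) in enumerate(tokens):
        if word != first:
            continue
        for gap in range(min_gap, max_gap + 1):
            end = start + gap + 1
            if end >= len(tokens):
                break
            end_word, end_sentence, is_final = tokens[end]
            if end_word == last and is_final:
                # The third field records whether the match stays in one sentence.
                matches.append((start, end, start_sentence == end_sentence))
    return matches


# Hypothetical two-sentence corpus.
sentences = [
    ["Das", "ist", "für", "dich"],
    ["Er", "kommt", "zu", "uns"],
]
print(find_gap_matches(sentences, "für", "uns", 2, 5))
```

         <p>The single match starts at "für" in the first sentence and ends at
           "uns" in the second, so the reported flag for staying within one
           sentence is false.</p>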

       </section>