<?php
/**
 * This source file is subject to the new BSD license that is bundled
 * with this package in the file LICENSE.txt.
 * It is also available through the world-wide-web at this URL:
 * http://framework.zend.com/license/new-bsd
 * If you did not receive a copy of the license and are unable to
 * obtain it through the world-wide-web, please send an email
 * to license@zend.com so we can send you a copy immediately.
 *
 * @package    Zend_Search_Lucene
 * @subpackage Analysis
 * @copyright  Copyright (c) 2005-2009 Zend Technologies USA Inc. (http://www.zend.com)
 * @license    http://framework.zend.com/license/new-bsd     New BSD License
 * @version    $Id: Token.php 16541 2009-07-07 06:59:03Z bkarwin $
 */

/**
 * @package    Zend_Search_Lucene
 * @subpackage Analysis
 * @copyright  Copyright (c) 2005-2009 Zend Technologies USA Inc. (http://www.zend.com)
 * @license    http://framework.zend.com/license/new-bsd     New BSD License
 */
class Zend_Search_Lucene_Analysis_Token
{
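    /**
     * The text of the term.
     */
    private $_termText;

    /**
     * Start offset of the token in the source text.
     */
    private $_startOffset;

    /**
     * End offset of the token in the source text.
     */
    private $_endOffset;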

    /**
     * The position of this token relative to the previous Token.
     *
     * The default value is one.
     *
     * Some common uses for this are:
     *
     * - Set it to zero to put multiple terms in the same position. This is
     *   useful if, e.g., a word has multiple stems. Searches for phrases
     *   including either stem will match. In this case, the increment of the
     *   first stem should be one and the increment of every other stem
     *   should be zero. Repeating a token with an increment of zero can also
     *   be used to boost the scores of matches on that token.
     *
     * - Set it to values greater than one to inhibit exact phrase matches.
     *   If, for example, one does not want phrases to match across removed
     *   stop words, then one could build a stop word filter that removes stop
     *   words and also raises the increment of each non-stop word by the
     *   number of stop words removed before it. Exact phrase queries will
     *   then match only when the terms occur with no intervening stop words.
     */
    private $_positionIncrement;
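
    /*
     * Usage sketch (not part of the original source): one way the two
     * conventions described above might be applied by a custom token filter.
     * The sample text, the offsets and the variable names are assumptions
     * made for illustration, as is counting the gap as one plus the number
     * of removed stop words.
     *
     *     // Source text: "running in the rain"
     *     $word = new Zend_Search_Lucene_Analysis_Token('running', 0, 7);
     *
     *     // A second stem of the same word shares the first stem's position.
     *     $stem = new Zend_Search_Lucene_Analysis_Token('run', 0, 7);
     *     $stem->setPositionIncrement(0);
     *
     *     // "in" and "the" were removed as stop words, so "rain" records
     *     // the gap: 1 for its own position plus 2 removed words.
     *     $rain = new Zend_Search_Lucene_Analysis_Token('rain', 15, 19);
     *     $rain->setPositionIncrement(3);
     */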

    /**
     * Object constructor. The position increment defaults to one.
     *
     * @param string  $text
     * @param integer $start
     * @param integer $end
     */
    public function __construct($text, $start, $end)
    {
        $this->_termText    = $text;
        $this->_startOffset = $start;
        $this->_endOffset   = $end;

        $this->_positionIncrement = 1;
    }

    /**
     * Sets the position increment of this Token.
     *
     * @param integer $positionIncrement
     */
    public function setPositionIncrement($positionIncrement)
    {
        $this->_positionIncrement = $positionIncrement;
    }

    /**
     * Returns the position increment of this Token.
     *
     * @return integer
     */
    public function getPositionIncrement()
    {
        return $this->_positionIncrement;
    }

    /**
     * Returns the Token's term text.
     *
     * @return string
     */
    public function getTermText()
    {
        return $this->_termText;
    }

    /**
     * Returns this Token's starting offset, the position of the first character
     * corresponding to this token in the source text.
     *
     * Note: the difference between getEndOffset() and getStartOffset() is not
     * necessarily equal to strlen($this->getTermText()), because the term text
     * may have been altered by a stemmer or some other filter.
     *
     * @return integer
     */
    public function getStartOffset()
    {
        return $this->_startOffset;
    }
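
    /*
     * Illustration (not part of the original source): why the offset span
     * and the term length can differ. The sample word and offsets are
     * assumptions made for illustration.
     *
     *     // "running" starts at offset 0 and ends at offset 7 in the source
     *     // text, but a stemmer has replaced the term text with "run".
     *     $token  = new Zend_Search_Lucene_Analysis_Token('run', 0, 7);
     *     $span   = $token->getEndOffset() - $token->getStartOffset(); // 7
     *     $length = strlen($token->getTermText());                     // 3
     */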

    /**
     * Returns this Token's ending offset, one greater than the position of the
     * last character corresponding to this token in the source text.
     *
     * @return integer
     */
    public function getEndOffset()
    {
        return $this->_endOffset;
    }
}