(Slightly altered from the page:
http://www.mysql.com/doc/en/Fulltext_Search.html.)
MySQL (the database this collection is built on) uses a very simple
parser to split text into words. A ``word'' is any sequence of
characters consisting of letters, digits, apostrophes ('), and
underscores (_). Any ``word'' that is present in the stopword list
or is just too short is ignored. The default minimum length of
words that will be found by full-text searches is four
characters. Common words, those that occur in at least 50%
of the documents, are also ignored (e.g., 'in', 'the', 'prime', ...).
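As a rough illustration of the filtering rules above, here is a minimal Python sketch. It is not MySQL's actual parser: the names index_words and too_common, the tiny STOPWORDS subset, and the ASCII-only word pattern are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters, digits, apostrophes, underscores.
WORD_RE = re.compile(r"[A-Za-z0-9_']+")
MIN_WORD_LEN = 4                          # default minimum length described above
STOPWORDS = {"about", "after", "would"}   # tiny illustrative subset of the list


def index_words(text):
    """Return the lowercase words of `text` that survive the length and stopword filters."""
    words = (w.lower() for w in WORD_RE.findall(text))
    return [w for w in words if len(w) >= MIN_WORD_LEN and w not in STOPWORDS]


def too_common(documents, threshold=0.5):
    """Return the words that occur in at least `threshold` of the documents."""
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(index_words(doc)))   # count each word once per document
    return {w for w, n in doc_freq.items() if n / len(documents) >= threshold}


if __name__ == "__main__":
    print(index_words("The prime_number isn't about 17"))
    print(too_common(["prime numbers", "large prime", "twin pairs"]))
```

In this sketch ``The'' and ``17'' are dropped for being too short, ``about'' is dropped as a stopword, and ``prime'' is flagged as too common because it appears in two of the three sample documents.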
We can also perform boolean full-text searches by using
one or more of the nine special characters:
+ - < > ( ) ~ * "
When any one of these characters is present, the 50% threshold is
not used.
+
- A leading plus sign indicates that this word must be
present in every row returned.
-
- A leading minus sign indicates that this word must not be
present in any row returned.
(no operator)
- By default (when neither plus nor minus is specified) the word is optional,
but the rows that contain it will be rated higher. This mimics the
behavior of MATCH() ... AGAINST() without the IN BOOLEAN MODE modifier.
< >
- These two operators are used to change a word's contribution to the
relevance value that is assigned to a row. The < operator decreases the
contribution and the > operator increases it. See the example below.
( )
- Parentheses are used to group words into subexpressions.
~
- A leading tilde acts as a negation operator, causing the word's contribution
to the row relevance to be negative. It's useful for marking noise words. A row
that contains such a word will be rated lower than others, but will not be
excluded altogether, as it would be with the - operator.
*
- An asterisk is the truncation operator. Unlike the other operators, it
should be appended to the word, not prepended.
"
- A phrase that is enclosed in double quotes (") matches only
rows that contain this phrase literally, as it was typed.
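To make the operator syntax concrete, the following Python sketch splits a boolean query into terms and records which operator was attached to each. It is only an illustration: Term, TOKEN_RE and parse_terms are hypothetical names, parentheses are simply skipped (grouping is not handled), and MySQL's real parser may treat edge cases differently.

```python
import re
from typing import NamedTuple


class Term(NamedTuple):
    text: str      # the word, or the phrase without its quotes
    op: str        # leading operator: "+", "-", "<", ">", "~", or "" for none
    prefix: bool   # True when a trailing * requests truncation matching
    phrase: bool   # True when the term was enclosed in double quotes


# Optional leading operator, then either a quoted phrase or a word with optional *.
TOKEN_RE = re.compile(r"""([+\-<>~]?)(?:"([^"]*)"|([A-Za-z0-9_']+)(\*?))""")


def parse_terms(query):
    """Return the terms of a boolean query; parentheses are ignored in this sketch."""
    terms = []
    for op, phrase, word, star in TOKEN_RE.findall(query):
        if word:
            terms.append(Term(word, op, star == "*", False))
        elif phrase:
            terms.append(Term(phrase, op, False, True))
    return terms


if __name__ == "__main__":
    for q in ['+apple -MacIntosh', 'apple*', '"some words"', '~noise']:
        print(q, "->", parse_terms(q))
```

Keeping the operator as a separate field lets later code decide what ``required'', ``excluded'' or ``less relevant'' should mean without re-reading the query string.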
And here are some examples:
apple banana
- find rows that contain at least one of these words.
+apple +juice
- ... both words.
+apple MacIntosh
- ... the word ``apple'', but rank rows higher if they also contain ``MacIntosh''.
+apple -MacIntosh
- ... the word ``apple'' but not ``MacIntosh''.
+apple +(>turnover <strudel)
- ... ``apple'' and ``turnover'', or ``apple'' and ``strudel'' (in any order),
but rank ``apple turnover'' higher than ``apple strudel''.
apple*
- ... ``apple'', ``apples'', ``applesauce'', and ``applet''.
"some words"
- ... ``some words of wisdom'', but not ``some noise words''.
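The first few examples can be imitated with a small Python sketch that applies the +, - and no-operator rules to a list of rows. This is a toy model, not MySQL's relevance calculation: matches, score and search are hypothetical helpers, terms are passed in as (operator, word) pairs, the score is just a count of optional words found, and the minimum word length and stopword list are ignored.

```python
def matches(row_text, terms):
    """Apply the +, - and no-operator rules to one row of text."""
    words = set(row_text.lower().split())
    required = [w for op, w in terms if op == "+"]
    excluded = [w for op, w in terms if op == "-"]
    optional = [w for op, w in terms if op == ""]
    if any(w in words for w in excluded):
        return False                      # a "-" word rules the row out
    if not all(w in words for w in required):
        return False                      # every "+" word must be present
    # Without any "+" word, at least one optional word has to match.
    return bool(required) or any(w in words for w in optional)


def score(row_text, terms):
    """Toy relevance: the number of optional words the row contains."""
    words = set(row_text.lower().split())
    return sum(1 for op, w in terms if op == "" and w in words)


def search(rows, terms):
    """Return the matching rows, highest toy score first."""
    hits = [(score(r, terms), r) for r in rows if matches(r, terms)]
    return [r for s, r in sorted(hits, key=lambda h: -h[0])]


if __name__ == "__main__":
    rows = ["apple juice", "apple macintosh pie", "banana bread"]
    print(search(rows, [("", "apple"), ("", "banana")]))
    print(search(rows, [("+", "apple"), ("+", "juice")]))
    print(search(rows, [("+", "apple"), ("", "macintosh")]))
    print(search(rows, [("+", "apple"), ("-", "macintosh")]))
```

With the three sample rows, the query for ``+apple macintosh'' returns both apple rows but lists the one containing ``macintosh'' first, while ``+apple -macintosh'' returns only ``apple juice''.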
MySQL Stopwords
MySQL by default does not index the following words:
"a", "a's", "able", "about", "above", "according", "accordingly", "across", "actually", "after",
"afterwards", "again", "against", "ain't", "all", "allow", "allows", "almost", "alone", "along",
"already", "also", "although", "always", "am", "among", "amongst", "an", "and", "another", "any",
"anybody", "anyhow", "anyone", "anything", "anyway", "anyways", "anywhere", "apart", "appear",
"appreciate", "appropriate", "are", "aren't", "around", "as", "aside", "ask", "asking", "associated",
"at", "available", "away", "awfully", "b", "be", "became", "because", "become", "becomes", "becoming",
"been", "before", "beforehand", "behind", "being", "believe", "below", "beside", "besides", "best",
"better", "between", "beyond", "both", "brief", "but", "by", "c", "c'mon", "c's", "came", "can",
"can't", "cannot", "cant", "cause", "causes", "certain", "certainly", "changes", "clearly", "co", "com",
"come", "comes", "concerning", "consequently", "consider", "considering", "contain", "containing",
"contains", "corresponding", "could", "couldn't", "course", "currently", "d", "definitely", "described",
"despite", "did", "didn't", "different", "do", "does", "doesn't", "doing", "don't", "done", "down",
"downwards", "during", "e", "each", "edu", "eg", "eight", "either", "else", "elsewhere", "enough",
"entirely", "especially", "et", "etc", "even", "ever", "every", "everybody", "everyone", "everything",
"everywhere", "ex", "exactly", "example", "except", "f", "far", "few", "fifth", "first", "five",
"followed", "following", "follows", "for", "former", "formerly", "forth", "four", "from", "further",
"furthermore", "g", "get", "gets", "getting", "given", "gives", "go", "goes", "going", "gone", "got",
"gotten", "greetings", "h", "had", "hadn't", "happens", "hardly", "has", "hasn't", "have", "haven't",
"having", "he", "he's", "hello", "help", "hence", "her", "here", "here's", "hereafter", "hereby",
"herein", "hereupon", "hers", "herself", "hi", "him", "himself", "his", "hither", "hopefully", "how",
"howbeit", "however", "i", "i'd", "i'll", "i'm", "i've", "ie", "if", "ignored", "immediate", "in",
"inasmuch", "inc", "indeed", "indicate", "indicated", "indicates", "inner", "insofar", "instead",
"into", "inward", "is", "isn't", "it", "it'd", "it'll", "it's", "its", "itself", "j", "just", "k",
"keep", "keeps", "kept", "know", "knows", "known", "l", "last", "lately", "later", "latter", "latterly",
"least", "less", "lest", "let", "let's", "like", "liked", "likely", "little", "look", "looking",
"looks", "ltd", "m", "mainly", "many", "may", "maybe", "me", "mean", "meanwhile", "merely", "might",
"more", "moreover", "most", "mostly", "much", "must", "my", "myself", "n", "name", "namely", "nd",
"near", "nearly", "necessary", "need", "needs", "neither", "never", "nevertheless", "new", "next",
"nine", "no", "nobody", "non", "none", "noone", "nor", "normally", "not", "nothing", "novel", "now",
"nowhere", "o", "obviously", "of", "off", "often", "oh", "ok", "okay", "old", "on", "once", "one",
"ones", "only", "onto", "or", "other", "others", "otherwise", "ought", "our", "ours", "ourselves",
"out", "outside", "over", "overall", "own", "p", "particular", "particularly", "per", "perhaps",
"placed", "please", "plus", "possible", "presumably", "probably", "provides", "q", "que", "quite", "qv",
"r", "rather", "rd", "re", "really", "reasonably", "regarding", "regardless", "regards", "relatively",
"respectively", "right", "s", "said", "same", "saw", "say", "saying", "says", "second", "secondly",
"see", "seeing", "seem", "seemed", "seeming", "seems", "seen", "self", "selves", "sensible", "sent",
"serious", "seriously", "seven", "several", "shall", "she", "should", "shouldn't", "since", "six", "so",
"some", "somebody", "somehow", "someone", "something", "sometime", "sometimes", "somewhat", "somewhere",
"soon", "sorry", "specified", "specify", "specifying", "still", "sub", "such", "sup", "sure", "t",
"t's", "take", "taken", "tell", "tends", "th", "than", "thank", "thanks", "thanx", "that", "that's",
"thats", "the", "their", "theirs", "them", "themselves", "then", "thence", "there", "there's",
"thereafter", "thereby", "therefore", "therein", "theres", "thereupon", "these", "they", "they'd",
"they'll", "they're", "they've", "think", "third", "this", "thorough", "thoroughly", "those", "though",
"three", "through", "throughout", "thru", "thus", "to", "together", "too", "took", "toward", "towards",
"tried", "tries", "truly", "try", "trying", "twice", "two", "u", "un", "under", "unfortunately",
"unless", "unlikely", "until", "unto", "up", "upon", "us", "use", "used", "useful", "uses", "using",
"usually", "v", "value", "various", "very", "via", "viz", "vs", "w", "want", "wants", "was", "wasn't",
"way", "we", "we'd", "we'll", "we're", "we've", "welcome", "well", "went", "were", "weren't", "what",
"what's", "whatever", "when", "whence", "whenever", "where", "where's", "whereafter", "whereas",
"whereby", "wherein", "whereupon", "wherever", "whether", "which", "while", "whither", "who", "who's",
"whoever", "whole", "whom", "whose", "why", "will", "willing", "wish", "with", "within", "without",
"won't", "wonder", "would", "would", "wouldn't", "x", "y", "yes", "yet", "you", "you'd", "you'll",
"you're", "you've", "your", "yours", "yourself", "yourselves", "z", "zero"