<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Huddled Masses &#187; MySQL</title>
	<atom:link href="http://huddledmasses.org/tag/mysql/feed/" rel="self" type="application/rss+xml" />
	<link>http://huddledmasses.org</link>
	<description>You can do more than breathe for free...</description>
	<lastBuildDate>Fri, 27 Apr 2012 05:42:40 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
<cloud domain='huddledmasses.org' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
		<item>
		<title>Searching the PoshCode Repository</title>
		<link>http://huddledmasses.org/searching-the-poshcode-repository/</link>
		<comments>http://huddledmasses.org/searching-the-poshcode-repository/#comments</comments>
		<pubDate>Thu, 10 Jul 2008 12:48:01 +0000</pubDate>
		<dc:creator>Joel 'Jaykul' Bennett</dc:creator>
				<category><![CDATA[Huddled]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PoshCode]]></category>
		<category><![CDATA[Scripts]]></category>
		<category><![CDATA[Searching]]></category>

		<guid isPermaLink="false">http://HuddledMasses.org/?p=565</guid>
		<description><![CDATA[I&#8217;ve been having problems with the search functionality on the PoshCode repository, and I just thought I&#8217;d throw this up here because I just now solved the biggest problem: ranking. Up until now, the results have not been returned in order of relevance &#8212; this is because the search works using MySQL&#8217;s FULLTEXT BOOLEAN search, [...]]]></description>
			<content:encoded><![CDATA[	<p>I&#8217;ve been having problems with the search functionality on the PoshCode repository, and I&#8217;m posting this because I just solved the <strong>biggest</strong> problem: ranking. Until now, results were not returned in order of relevance, because the search uses MySQL&#8217;s <a href="http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html"><span class="caps">FULLTEXT</span> BOOLEAN</a> search, which neither returns rows in relevance order nor adds an extra &#8216;score&#8217; column.</p>

	<p>I&#8217;ve fixed that by creating a relevance column by hand, weighted so that words in the title (1.3) count more than words in the code (1.0) or the description (0.8):</p>

	<div class="sql code sql" style="font-family:monospace;"><br />
<span style="color: #993333; font-weight: bold;">SELECT</span> <span style="color: #66cc66;">*,</span><br />
<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">1.3</span> <span style="color: #66cc66;">*</span> <span style="color: #66cc66;">&#40;</span>MATCH<span style="color: #66cc66;">&#40;</span>posttitle<span style="color: #66cc66;">&#41;</span> AGAINST <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'keywords'</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">BOOLEAN</span> MODE<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><br />
<span style="color: #66cc66;">+</span> <span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">0.8</span> <span style="color: #66cc66;">*</span> <span style="color: #66cc66;">&#40;</span>MATCH<span style="color: #66cc66;">&#40;</span>description<span style="color: #66cc66;">&#41;</span> AGAINST <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'keywords'</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">BOOLEAN</span> MODE<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><br />
<span style="color: #66cc66;">+</span> <span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">1.0</span> <span style="color: #66cc66;">*</span> <span style="color: #66cc66;">&#40;</span>MATCH<span style="color: #66cc66;">&#40;</span>code<span style="color: #66cc66;">&#41;</span> AGAINST <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'keywords'</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">BOOLEAN</span> MODE<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> relevance <br />
<span style="color: #993333; font-weight: bold;">FROM</span> pastebin <span style="color: #993333; font-weight: bold;">WHERE</span> MATCH <span style="color: #66cc66;">&#40;</span>posttitle<span style="color: #66cc66;">,</span>description<span style="color: #66cc66;">,</span>code<span style="color: #66cc66;">&#41;</span> AGAINST <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'keywords'</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">BOOLEAN</span> MODE<span style="color: #66cc66;">&#41;</span> <br />
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> relevance <span style="color: #993333; font-weight: bold;">DESC</span> <span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">25</span><br />
&nbsp;</div>
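
	<p>A minimal sketch of how the same statement could be built from application code, with the search string passed as a parameter instead of being pasted into the SQL. It is written in Python purely for illustration: the function name build_search_query and the %s placeholder style (as used by DB-API drivers such as MySQLdb) are assumptions, not the code PoshCode actually runs. The table, columns and weights are the ones from the query above.</p>

```python
# Hypothetical helper, not PoshCode's real code. Table name, column names and
# weights come from the post; the "%s" placeholder style is an assumption.
WEIGHTS = (("posttitle", 1.3), ("description", 0.8), ("code", 1.0))


def build_search_query(keywords, weights=WEIGHTS, limit=25):
    """Return (sql, params) for a relevance-ranked boolean full-text search."""
    # One weighted MATCH ... AGAINST term per column, summed into "relevance".
    terms = " + ".join(
        "({:.1f} * (MATCH({}) AGAINST (%s IN BOOLEAN MODE)))".format(weight, column)
        for column, weight in weights
    )
    columns = ",".join(column for column, _ in weights)
    sql = (
        "SELECT *, ({}) AS relevance "
        "FROM pastebin WHERE MATCH ({}) AGAINST (%s IN BOOLEAN MODE) "
        "ORDER BY relevance DESC LIMIT {:d}"
    ).format(terms, columns, int(limit))
    # One parameter per scoring term, plus one for the WHERE clause.
    params = (keywords,) * (len(weights) + 1)
    return sql, params
```

	<p>With a DB-API cursor this would be used as cursor.execute(*build_search_query("+sql* backup")), so the driver takes care of quoting the keywords.</p>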

	<p>Incidentally, the <span class="caps">FULLTEXT</span> index ignores words shorter than 4 characters. I&#8217;m going to try to get this changed, but the minimum length is a MySQL server option (ft_min_word_len), so it has to be changed in the config file. In the meantime you can search for shorter words using the wildcard character, like SQL*, and it sort of works. The PoshCode cmdlet was actually adding *&#8217;s to the query, although I&#8217;ve just decided that&#8217;s not a good idea, because it means that queries from the cmdlet appear to return different results than queries on the website.</p>

	<p>MySQL&#8217;s <a href="http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html"><span class="caps">FULLTEXT</span> BOOLEAN</a> search has all sorts of features (and limitations): there is a stopword list, maximum and minimum word lengths, and operators for setting word precedence, negating words, or weighting them negatively. To <span class="caps">REQUIRE</span> that a word be present, you must put a + in front of it, and to mark a word as more important you have to put &gt; in front of it, not just put it first. I&#8217;ve been thinking about applying a few of those tricks myself (e.g. put * on words under four characters, and put &gt; on the first 30% of words and &lt; on the last 30% to try to simulate weighting them), but my original feeling was that the search is more powerful if you just know that it&#8217;s a fulltext boolean search and can write your queries accordingly.</p>
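
	<p>The rewriting idea described above could look roughly like the sketch below. It is only an illustration in Python, not something the site or the cmdlet does: the function names, the rounding (each 30% band is rounded down, so short queries get no weighting) and the choice to leave any query that already contains operators untouched are all assumptions.</p>

```python
LEADING_OPERATORS = "+-<>~"


def has_operators(word):
    """True if the word already uses boolean-mode syntax."""
    return (
        word[0] in LEADING_OPERATORS
        or word.endswith("*")
        or any(ch in '()"' for ch in word)
    )


def add_boolean_operators(query, min_len=4, percent=30):
    """Add * to short words and > / < to the first / last `percent` of words."""
    words = query.split()
    # Assumption: if the user wrote any operator, respect the query as typed.
    if any(has_operators(word) for word in words):
        return query
    edge = len(words) * percent // 100  # size of each weighted band, rounded down
    rewritten = []
    for i, word in enumerate(words):
        if len(word) < min_len:
            word += "*"  # wildcard so words under the minimum length can match
        if i < edge:
            word = ">" + word  # raise the weight of leading words
        elif i >= len(words) - edge:
            word = "<" + word  # lower the weight of trailing words
        rewritten.append(word)
    return " ".join(rewritten)
```

	<p>Only a leading operator counts as &#8220;already has operators&#8221; here, so a hyphenated cmdlet name such as Get-Command is still treated as a plain word.</p>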

	<p>If anyone has any ideas for how to improve search in MySQL &#8230; or opinions on whether I should try to apply boolean operators to queries which don&#8217;t already have them &#8230; please let me know.</p>]]></content:encoded>
			<wfw:commentRss>http://huddledmasses.org/searching-the-poshcode-repository/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

