<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Huddled Masses &#187; Searching</title>
	<atom:link href="http://huddledmasses.org/tag/searching/feed/" rel="self" type="application/rss+xml" />
	<link>http://huddledmasses.org</link>
	<description>You can do more than breathe for free...</description>
	<lastBuildDate>Fri, 27 Apr 2012 05:42:40 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
<cloud domain='huddledmasses.org' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
		<item>
		<title>Custom IComparers in PowerShell (and Add-Type for v1)</title>
		<link>http://huddledmasses.org/custom-icomparers-in-powershell-and-add-type-for-v1/</link>
		<comments>http://huddledmasses.org/custom-icomparers-in-powershell-and-add-type-for-v1/#comments</comments>
		<pubDate>Wed, 17 Dec 2008 04:53:44 +0000</pubDate>
		<dc:creator>Joel 'Jaykul' Bennett</dc:creator>
				<category><![CDATA[Huddled]]></category>
		<category><![CDATA[.Net]]></category>
		<category><![CDATA[BinarySearch]]></category>
		<category><![CDATA[PowerShell]]></category>
		<category><![CDATA[Scripting]]></category>
		<category><![CDATA[Searching]]></category>

		<guid isPermaLink="false">http://huddledmasses.org/?p=878</guid>
		<description><![CDATA[Someone asked the question in #PowerShell (on irc.Freenode.net): How do I find an item (in an array) based on one of its properties? Actually, the question was rather more complicated than that. They were importing a bunch of users from a csv file, and wanted to sort them and search them based on a specific [...]]]></description>
			<content:encoded><![CDATA[	<p>Someone asked the question in #PowerShell (on irc.Freenode.net):</p>

	<h3>How do I find an item (in an array) based on one of its properties?</h3>

	<p>Actually, the question was rather more complicated than that. They were importing a bunch of users from a csv file, and wanted to sort them and search them based on a specific column. There are many ways to skin this cat. Imagine that you have a <span class="caps">CSV</span> file, and have imported it, like so:</p>

	<div class="posh code posh" style="font-family:monospace;"><br />
<span style="color: #0066cc; font-style: italic;">Set-<span style="font-style: normal;">Content</span></span> users.<span style="color: #003366;">csv</span> @<span style="color: #009900;">&quot;<br />
LastName, FirstName, UserName, Url<br />
Bennett, Joel, Jaykul, http://HuddledMasses.org<br />
Rottenberg, Hal, HalR9000, http://halr9000.com<br />
Hicks, Jeffrey, SapienScripter, http://blog.sapien.com/<br />
&quot;</span>@<br />
<br />
<span style="color: #660033; font-weight: bold;">$users</span> <span style="color: #66cc66;">=</span> <span style="color: #0066cc; font-style: italic;">Import-<span style="font-style: normal;">Csv</span></span> .\users.<span style="color: #003366;">csv</span></div>

	<p>Now, imagine that the <span class="caps">CSV</span> file has thousands of users in it, and that you need to not only sort the data by first name or last name on demand, but you also need to pull users from the list (by name) on demand.</p>

	<p>These are trivial tasks in PowerShell:<span id="more-878"></span></p>

	<div class="posh code posh" style="font-family:monospace;"><br />
<span style="color: #660033; font-weight: bold;">$users</span> <span style="color: #66cc66;">=</span> <span style="color: #660033; font-weight: bold;">$users</span> <span style="color: #66cc66;">|</span> <span style="color: #660033;">Sort</span> Lastname, Firstname<br />
<span style="color: #666666; font-style: italic;"># or </span><br />
<span style="color: #660033; font-weight: bold;">$users</span> <span style="color: #66cc66;">=</span> <span style="color: #660033; font-weight: bold;">$users</span> <span style="color: #66cc66;">|</span> <span style="color: #660033;">Sort</span> Firstname, Lastname<br />
<br />
<span style="color: #666699; font-weight: bold;">return</span> <span style="color: #660033; font-weight: bold;">$users</span> <span style="color: #66cc66;">|</span> <span style="color: #660033;">Where</span> <span style="color: #333;">&#123;</span> <span style="color: #660033; font-weight: bold;">$_</span>.<span style="color: #003366;">Firstname</span> <span style="color: #000066;">-eq</span> <span style="color: #009900;">&quot;Bob&quot;</span> <span style="color: #333;">&#125;</span> <span style="color: #66cc66;">|</span> <span style="color: #660033;">Select</span> <span style="color: #000066;">-First</span> <span style="color: #cc66cc;">1</span><br />
&nbsp;</div>

	<p>This <em>works</em> but will have some performance issues when the list gets too long, because your search algorithm is &#8230; well, we could say it&#8217;s O(n), or just say it&#8217;s slow, because you&#8217;re basically going through the whole list every time. And I do mean the <span class="caps">WHOLE</span> list. You were looking for &#8220;Bob&#8221; ... but the <code>Where-Object</code> cmdlet doesn&#8217;t know your data is sorted, and doesn&#8217;t know you only want one user, so it&#8217;s going to iterate the whole list, because <code>Select-Object</code> can&#8217;t signal it to stop.</p>
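	<p>As a minimal sketch of the difference (the sample data and the <code>$checked</code> counter are illustrative, not from the original question), a plain <code>foreach</code> loop can at least stop at the first match. It is still a linear scan, so the worst case is the same, but it does not keep going after it has found what it wants:</p>

```powershell
# Illustrative data only: hashtables stand in for the imported CSV rows
$users = @{ FirstName = 'Hal' }, @{ FirstName = 'Joel' }, @{ FirstName = 'Jeffrey' }

$checked = 0      # how many items we looked at before stopping
$match = $null
foreach ($user in $users) {
    $checked++
    if ($user.FirstName -eq 'Joel') {
        $match = $user
        break     # stop as soon as the first match is found
    }
}

"Found $($match.FirstName) after checking $checked of $($users.Count) items"
```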

	<p>There are several ways to solve this, but what it comes down to is that you need a (better) search algorithm than Where-Object can implement: simply put, you&#8217;d like to use a binary search, since the data is already sorted.</p>

	<h4>Binary Search</h4>

	<p>I&#8217;m not going to get into this much, but to put it simply: a binary search is pretty much the fastest search you can do on a list, but it depends on the data already being sorted.  It works by starting in the middle, and comparing the middle item to the item you are searching for. If what you&#8217;re searching for comes <span class="caps">BEFORE</span> that item, it discards the second half of the items and examines the middle item of the first half; if it comes <span class="caps">AFTER</span>, it discards the first half instead.  In this way, it eliminates half the remaining items each time, and finds the correct item (if it exists) in roughly log<sub>2</sub>(n) steps.</p>
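	<p>To make the halving concrete, here is a rough sketch of the algorithm over a sorted array of strings. <code>Find-SortedIndex</code> is a hypothetical helper written for illustration; it is not part of PowerShell or of the original script, and it assumes the array was sorted with the same ordinal comparison it searches with:</p>

```powershell
# Hypothetical helper: iterative binary search over a sorted string array.
# Returns the index of the match, or -1 when the target is not present.
function Find-SortedIndex {
    param([string[]]$Sorted, [string]$Target)

    $low = 0
    $high = $Sorted.Length - 1
    while ($low -le $high) {
        # Middle of the range that is still under consideration
        $mid = [int][Math]::Floor(($low + $high) / 2)
        $order = [string]::CompareOrdinal($Sorted[$mid], $Target)

        if ($order -eq 0) { return $mid }
        if ($order -lt 0) {
            $low = $mid + 1     # middle item sorts before the target: drop the first half
        } else {
            $high = $mid - 1    # middle item sorts after the target: drop the second half
        }
    }
    return -1
}

$names = 'Hal', 'Jeffrey', 'Joel'   # already in ordinal order
Find-SortedIndex -Sorted $names -Target 'Joel'
```

	<p>Each pass through the loop halves the range between <code>$low</code> and <code>$high</code>, which is where the logarithmic step count comes from.</p>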

	<h4>So how do we do that in PowerShell?</h4>

	<p>Well, that&#8217;s easy: Just use the static method: <code>[Array]::BinarySearch</code>&#8230; The only problem is, in order for BinarySearch to work, the items have to be not only sorted, but comparable.  Because we&#8217;re trying to compare PSObjects, you can&#8217;t just compare two items from the $users array and determine the correct order&#8230;</p>

	<p>If you wanted to use <code>[Array]::Sort</code> you could use an overload that lets you pass in a second array (of comparable items) to serve as the keys for the non-comparable items, but this won&#8217;t work for BinarySearch:<br />
<div class="posh code posh" style="font-family:monospace;"><br />
<span style="color: #003366; font-weight: bold;"><span style="color: #333;">&#91;</span><span style="color: #003366; font-weight: bold;">Array</span><span style="color: #333;">&#93;</span></span>::<span style="color: #660033;">Sort</span><span style="color: #333;">&#40;</span> $<span style="color: #333;">&#40;</span>@<span style="color: #333;">&#40;</span><span style="color: #660033; font-weight: bold;">$users</span><span style="color: #66cc66;">|%</span><span style="color: #333;">&#123;</span><span style="color: #660033; font-weight: bold;">$_</span>.<span style="color: #003366;">Firstname</span><span style="color: #66cc66;">+</span><span style="color: #009900;">&quot; &quot;</span><span style="color: #66cc66;">+</span><span style="color: #660033; font-weight: bold;">$_</span>.<span style="color: #003366;">Lastname</span><span style="color: #333;">&#125;</span><span style="color: #333;">&#41;</span><span style="color: #333;">&#41;</span>,<span style="color: #660033; font-weight: bold;">$users</span> <span style="color: #333;">&#41;</span></div>

	<p>So what you need is an &#8220;IComparer&#8221;: that is, a class which implements a &#8220;Compare&#8221; method which can be used to determine the order of the objects.  Sadly, this requires a custom type &#8212; something you can&#8217;t do in pure PowerShell script, believe it or not.  PowerShell does not have the ability to create custom classes (which is how you can tell it&#8217;s not a real programming language), so when you need a custom type in PowerShell you have to drop to another .Net language like C# or Visual Basic. </p>

	<p>In PowerShell 2.0 there is a cmdlet &#8220;Add-Type&#8221; which can take a string representing C# source-code, and compile it in memory for immediate use, but in PowerShell 1.0 you need to have a script version of it, which I called New-Type, to avoid confusion because it only handles one of Add-Type&#8217;s four usage models:</p>

	<p><script type="text/javascript" src="http://PoshCode.org/embed/720"></script></p>

	<p>Once you have Add-Type (or New-Type), you can create a ScriptBlock-based implementation of IComparer, by passing this as a string:</p>

	<div class="posh code posh" style="font-family:monospace;"><span style="color: #0066cc; font-style: italic;">New-<span style="font-style: normal;">Type</span></span> @<span style="color: #009900;">&quot;</span></div><br />
<div class="csharp code csharp" style="font-family:monospace;"><br />
<span style="color: #0600FF;">using</span> <span style="color: #008080;">System</span><span style="color: #008000;">;</span><br />
<span style="color: #0600FF;">using</span> <span style="color: #008080;">System.Collections</span><span style="color: #008000;">;</span><br />
<span style="color: #0600FF;">using</span> <span style="color: #008080;">System.Management.Automation</span><span style="color: #008000;">;</span><br />
&nbsp; &nbsp;<span style="color: #0600FF;">public</span> <span style="color: #0600FF;">sealed</span> <span style="color: #FF0000;">class</span> ScriptBlockComparer <span style="color: #008000;">:</span> IComparer<br />
&nbsp; &nbsp;<span style="color: #000000;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; ScriptBlock _Comparer<span style="color: #008000;">;</span><br />
&nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">public</span> ScriptBlockComparer<span style="color: #000000;">&#40;</span>ScriptBlock Comparer<span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span> ComparerScript <span style="color: #008000;">=</span> Comparer<span style="color: #008000;">;</span> <span style="color: #000000;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">public</span> ScriptBlock ComparerScript<br />
&nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;get <span style="color: #000000;">&#123;</span> <span style="color: #0600FF;">return</span> _Comparer<span style="color: #008000;">;</span> <span style="color: #000000;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;set <span style="color: #000000;">&#123;</span> _Comparer <span style="color: #008000;">=</span> value<span style="color: #008000;">;</span> <span style="color: #000000;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">public</span> <span style="color: #FF0000;">int</span> Compare<span style="color: #000000;">&#40;</span><span style="color: #FF0000;">object</span> x, <span style="color: #FF0000;">object</span> y<span style="color: #000000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #0600FF;">try</span> <span style="color: #000000;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">return</span> <span style="color: #000000;">&#40;</span><span style="color: #FF0000;">int</span><span style="color: #000000;">&#41;</span>ComparerScript.<span style="color: #0000FF;">Invoke</span><span style="color: #000000;">&#40;</span>x, y<span style="color: #000000;">&#41;</span><span style="color: #000000;">&#91;</span><span style="color: #FF0000;">0</span><span style="color: #000000;">&#93;</span>.<span style="color: #0000FF;">BaseObject</span><span style="color: #008000;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #000000;">&#125;</span> <span style="color: #0600FF;">catch</span><span style="color: #000000;">&#40;</span>Exception ex<span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0600FF;">throw</span> <a href="http://www.google.com/search?q=new+msdn.microsoft.com"><span style="color: #008000;">new</span></a> InvalidOperationException<span style="color: #000000;">&#40;</span><span style="color: #666666;">&quot;Comparer Script failed to return an integer!&quot;</span>, ex<span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #000000;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; <span style="color: #000000;">&#125;</span><br />
&nbsp; &nbsp;<span style="color: #000000;">&#125;</span></div><br />
<div class="posh code posh" style="font-family:monospace;"><span style="color: #009900;">&quot;@</span></div>

	<p>So now you have everything you need, and you can write your custom comparer, and do binary search. The one thing you need to be aware of is that when you&#8217;re doing binary search, you&#8217;re going to want to pass in the thing you&#8217;re searching for (i.e., the first name), not an Object with a FirstName property. So your comparer needs to handle the case where the second parameter is a string:</p>

	<div class="posh code posh" style="font-family:monospace;"><br />
<span style="color: #660033; font-weight: bold;">$cmp</span> <span style="color: #66cc66;">=</span> <span style="color: #0066cc; font-style: italic;">New-<span style="font-style: normal;">Object</span></span> ScriptBlockComparer <span style="color: #333;">&#123;</span><br />
<span style="color: #666699; font-weight: bold;">Param</span><span style="color: #333;">&#40;</span><span style="color: #660033; font-weight: bold;">$one</span>, <span style="color: #660033; font-weight: bold;">$two</span><span style="color: #333;">&#41;</span><br />
<span style="color: #666699; font-weight: bold;">if</span><span style="color: #333;">&#40;</span><span style="color: #660033; font-weight: bold;">$two</span> <span style="color: #000066;">-isnot</span> <span style="color: #003366; font-weight: bold;"><span style="color: #333;">&#91;</span><span style="color: #003366; font-weight: bold;">string</span><span style="color: #333;">&#93;</span></span><span style="color: #333;">&#41;</span> <span style="color: #333;">&#123;</span> <span style="color: #666699; font-weight: bold;">throw</span> <span style="color: #009900;">&quot;This comparer expects to compare objects to strings&quot;</span> <span style="color: #333;">&#125;</span><br />
&nbsp; &nbsp;<span style="color: #660033; font-weight: bold;">$first</span>,<span style="color: #660033; font-weight: bold;">$last</span><span style="color: #66cc66;">=</span> <span style="color: #660033; font-weight: bold;">$two</span>.<span style="color: #333399; font-weight: bold; font-style: italic;">Split</span><span style="color: #333;">&#40;</span><span style="color: #009900;">&quot; &quot;</span><span style="color: #333;">&#41;</span><br />
&nbsp; &nbsp;<span style="color: #660033; font-weight: bold;">$ord</span> <span style="color: #66cc66;">=</span> <span style="color: #660033; font-weight: bold;">$one</span>.<span style="color: #003366;">Firstname</span>.<span style="color: #003366;">CompareTo</span><span style="color: #333;">&#40;</span><span style="color: #660033; font-weight: bold;">$first</span><span style="color: #333;">&#41;</span> <br />
&nbsp; &nbsp;<span style="color: #666699; font-weight: bold;">if</span><span style="color: #333;">&#40;</span><span style="color: #660033; font-weight: bold;">$ord</span> <span style="color: #000066;">-eq</span> <span style="color: #cc66cc;">0</span> <span style="color: #000066;">-and</span> <span style="color: #660033; font-weight: bold;">$last</span><span style="color: #333;">&#41;</span> <span style="color: #333;">&#123;</span> <span style="color: #660033; font-weight: bold;">$ord</span> <span style="color: #66cc66;">=</span> <span style="color: #660033; font-weight: bold;">$one</span>.<span style="color: #003366;">LastName</span>.<span style="color: #003366;">CompareTo</span><span style="color: #333;">&#40;</span><span style="color: #660033; font-weight: bold;">$last</span><span style="color: #333;">&#41;</span> <span style="color: #333;">&#125;</span><br />
&nbsp; &nbsp;<span style="color: #666699; font-weight: bold;">return</span> <span style="color: #660033; font-weight: bold;">$ord</span><br />
<span style="color: #333;">&#125;</span><br />
<br />
<span style="color: #666666; font-style: italic;"># Now you can use this to search</span><br />
<span style="color: #660033; font-weight: bold;">$users</span><span style="color: #333;">&#91;</span><span style="color: #003366; font-weight: bold;"><span style="color: #333;">&#91;</span><span style="color: #003366; font-weight: bold;">Array</span><span style="color: #333;">&#93;</span></span>::<span style="color: #003366;">BinarySearch</span><span style="color: #333;">&#40;</span> <span style="color: #660033; font-weight: bold;">$users</span>, <span style="color: #009900;">&quot;Joel&quot;</span>, <span style="color: #660033; font-weight: bold;">$cmp</span> <span style="color: #333;">&#41;</span><span style="color: #333;">&#93;</span><br />
<span style="color: #666666; font-style: italic;"># Or this</span><br />
<span style="color: #660033; font-weight: bold;">$users</span><span style="color: #333;">&#91;</span><span style="color: #003366; font-weight: bold;"><span style="color: #333;">&#91;</span><span style="color: #003366; font-weight: bold;">Array</span><span style="color: #333;">&#93;</span></span>::<span style="color: #003366;">BinarySearch</span><span style="color: #333;">&#40;</span> <span style="color: #660033; font-weight: bold;">$users</span>, <span style="color: #009900;">&quot;Joel Bennett&quot;</span>, <span style="color: #660033; font-weight: bold;">$cmp</span> <span style="color: #333;">&#41;</span><span style="color: #333;">&#93;</span><br />
&nbsp;</div>

	<p>Now, you may not think this is a big deal, so let me just share with you the results of searching a collection of about 49000 users:</p>

<table>
<tr><th>Duration (sec)</th><th>Command</th></tr>
<tr><td>6.74000</td><td>$users = $users | Sort FirstName, LastName</td></tr>
<tr><td>6.68200</td><td>$users | Where { $_.FirstName -eq &quot;Joel&quot; }</td></tr>
<tr><td>0.02900</td><td>$users[[Array]::BinarySearch( $users, &quot;Joel&quot;, $cmp )]</td></tr>
</table>

	<p>What you see here is two things:
	<ol>
		<li>Searching for a single user using <code>Where-Object</code> takes pretty much as long as (re)sorting the whole array, because every single item must be checked.</li>
		<li>Using BinarySearch rocks  <img src='http://huddledmasses.org/wordpress/wp-includes/' alt=':D' class='wp-smiley' /> </li>
	</ol><br />
What you might not notice is that <code>[Array]::BinarySearch</code> returns an index, not the item &#8230; so it&#8217;s particularly useful if you need to find the first of something, etc. because you can just increment the result. </p>
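	<p>One consequence of getting an index back: when nothing matches, <code>[Array]::BinarySearch</code> returns a negative number (the bitwise complement of the position where the item would be inserted). PowerShell treats a negative index as counting from the end of the array, so indexing directly with the result can quietly hand back the wrong item. A small sketch of checking for that, using a plain string array as a stand-in for the user list:</p>

```powershell
$names = 'Hal', 'Jeffrey', 'Joel'   # stand-in data; must already be sorted

$index = [Array]::BinarySearch($names, 'Ian')
if ($index -lt 0) {
    # Not found: the complement of the result is where the item would be inserted
    $insertAt = -bnot $index
    "Not found; it would be inserted at index $insertAt"
} else {
    $names[$index]
}
```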

	<p><strong>BinarySearch is not always faster</strong>: If your array is initially unsorted, the requirement to pre-sort the array costs basically the same as any speedup you might get for a single search. Also, because of the call to the script, if the array is <strong>very</strong> small, like the three users in the sample above, it&#8217;s actually faster to just loop through them (20ms versus 24ms on my PC). The BinarySearch starts paying off pretty quickly as you scale up the number of searches and records, but it <strong>does take <em>both</em></strong> more records (the break-even point for just the search is about 10, in this contrived example), and more searches (if you have to eat the cost of the initial sort).</p>

	<p>In any case, New-Type certainly has many other uses, and the <strong>ScriptBlockComparer</strong> can be used for other things as well &#8230; although I will say that as far as I can tell, sorting is almost always better performed with the sort cmdlet than with [Array]::Sort and an IComparer.  <img src='http://huddledmasses.org/wordpress/wp-includes/' alt=';)' class='wp-smiley' /> </p>]]></content:encoded>
			<wfw:commentRss>http://huddledmasses.org/custom-icomparers-in-powershell-and-add-type-for-v1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Searching the PoshCode Repository</title>
		<link>http://huddledmasses.org/searching-the-poshcode-repository/</link>
		<comments>http://huddledmasses.org/searching-the-poshcode-repository/#comments</comments>
		<pubDate>Thu, 10 Jul 2008 12:48:01 +0000</pubDate>
		<dc:creator>Joel 'Jaykul' Bennett</dc:creator>
				<category><![CDATA[Huddled]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PoshCode]]></category>
		<category><![CDATA[Scripts]]></category>
		<category><![CDATA[Searching]]></category>

		<guid isPermaLink="false">http://HuddledMasses.org/?p=565</guid>
		<description><![CDATA[I&#8217;ve been having problems with the search functionality on the PoshCode repository, and I just thought I&#8217;d throw this up here because I just now solved the biggest problem: ranking. Up until now, the results have not been returned in order of relevance &#8212; this is because the search works using MySQL&#8217;s FULLTEXT BOOLEAN search, [...]]]></description>
			<content:encoded><![CDATA[	<p>I&#8217;ve been having problems with the search functionality on the PoshCode repository, and I just thought I&#8217;d throw this up here because I just now solved the <strong>biggest</strong> problem: ranking. Up until now, the results have not been returned in order of relevance &#8212; this is because the search works using MySQL&#8217;s <a href="http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html"><span class="caps">FULLTEXT</span> BOOLEAN</a> search, which doesn&#8217;t return in relevance order, nor does it return an extra &#8216;score&#8217; column.</p>

	<p>I&#8217;ve fixed that, and weighted the search so that words in the title count more than words in the code by creating a relevance column by hand:</p>

	<div class="sql code sql" style="font-family:monospace;"><br />
<span style="color: #993333; font-weight: bold;">SELECT</span> <span style="color: #66cc66;">*,</span><br />
<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">1.3</span> <span style="color: #66cc66;">*</span> <span style="color: #66cc66;">&#40;</span>MATCH<span style="color: #66cc66;">&#40;</span>posttitle<span style="color: #66cc66;">&#41;</span> AGAINST <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'keywords'</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">BOOLEAN</span> MODE<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><br />
<span style="color: #66cc66;">+</span> <span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">0.8</span> <span style="color: #66cc66;">*</span> <span style="color: #66cc66;">&#40;</span>MATCH<span style="color: #66cc66;">&#40;</span>description<span style="color: #66cc66;">&#41;</span> AGAINST <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'keywords'</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">BOOLEAN</span> MODE<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><br />
<span style="color: #66cc66;">+</span> <span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">1.0</span> <span style="color: #66cc66;">*</span> <span style="color: #66cc66;">&#40;</span>MATCH<span style="color: #66cc66;">&#40;</span>code<span style="color: #66cc66;">&#41;</span> AGAINST <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'keywords'</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">BOOLEAN</span> MODE<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> relevance <br />
<span style="color: #993333; font-weight: bold;">FROM</span> pastebin <span style="color: #993333; font-weight: bold;">WHERE</span> MATCH <span style="color: #66cc66;">&#40;</span>posttitle<span style="color: #66cc66;">,</span>description<span style="color: #66cc66;">,</span>code<span style="color: #66cc66;">&#41;</span> AGAINST <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'keywords'</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">BOOLEAN</span> MODE<span style="color: #66cc66;">&#41;</span> <br />
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> relevance <span style="color: #993333; font-weight: bold;">DESC</span> <span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">25</span><br />
&nbsp;</div>

	<p>Incidentally, the <span class="caps">FULLTEXT</span> index means that words shorter than 4 characters don&#8217;t count. I&#8217;m going to try to get this changed, but it&#8217;s a MySQL server option, so it has to be changed in the config file. In the meantime you can search for short words using the wildcard character, like: SQL* and it sort-of works.  The PoshCode cmdlet actually was adding *s to the query, although I&#8217;ve just decided that&#8217;s not a good idea, because it means that queries from the cmdlet appear to have different results than queries on the website.</p>

	<p>MySQL&#8217;s <a href="http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html"><span class="caps">FULLTEXT</span> BOOLEAN</a> search has all sorts of features (and limitations): there is a stopword list, maximum and minimum word lengths, and all sorts of operators for setting word precedence, negating words, or weighting them negatively. To <span class="caps">REQUIRE</span> that a word be present, it must have a + in front, and to mark a word as more important, you have to put &gt; in front, not just put it first. I&#8217;ve been thinking about trying to apply a few of those tricks myself (e.g., put * on words under four characters, and put &gt; on the first 30% of words and &lt; on the last 30% to try to simulate weighting them), but my original feeling was that the search is more powerful if you just know that it&#8217;s a fulltext boolean search and can write your queries accordingly.</p>
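	<p>For what it&#8217;s worth, the first of those tricks (a trailing * on short words) could be sketched roughly like this on the client side. <code>Add-ShortWordWildcard</code> and its <code>MinLength</code> parameter are hypothetical names made up for this example; this is not the code the PoshCode cmdlet or the website actually runs, and it makes no attempt to handle stopwords or words that already carry other operators:</p>

```powershell
# Hypothetical helper: append * to words shorter than the FULLTEXT minimum length
function Add-ShortWordWildcard {
    param([string]$Query, [int]$MinLength = 4)

    # Split on spaces and drop the empty entries left by repeated spaces
    $words = $Query.Split(' ') | Where-Object { $_ -ne '' }

    $rewritten = foreach ($word in $words) {
        if ($word.Length -lt $MinLength -and -not $word.EndsWith('*')) {
            "$word*"    # short word without a wildcard yet: add one
        } else {
            $word       # long enough, or already has a wildcard
        }
    }
    [string]::Join(' ', @($rewritten))
}

Add-ShortWordWildcard 'SQL backup AD'
```

	<p>The catch is the one already mentioned: once the client rewrites queries like this, the same keywords can return different results than they do on the website.</p>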

	<p>If anyone has any ideas for how to improve search in MySQL &#8230; or opinions on whether I should try to apply boolean operators to queries which don&#8217;t already have them &#8230; please let me know.</p>]]></content:encoded>
			<wfw:commentRss>http://huddledmasses.org/searching-the-poshcode-repository/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

