<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Sphinx Search in Japanese</title>
	<atom:link href="http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/feed/" rel="self" type="application/rss+xml" />
	<link>http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/</link>
	<description>What&#039;s better than toast? Crunchytoast!</description>
	<lastBuildDate>Wed, 25 Jan 2012 02:55:23 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: I looked into full-text search for Chinese. &#171; fmob中の人ブログ</title>
		<link>http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/#comment-18</link>
		<dc:creator>I looked into full-text search for Chinese. &#171; fmob中の人ブログ</dc:creator>
		<pubDate>Wed, 23 Feb 2011 06:02:33 +0000</pubDate>
		<guid isPermaLink="false">http://crunchytoast.com/?p=131#comment-18</guid>
		<description>[...] At present it seems rather unclear whether it supports Japanese, and there were a few guides on getting it to work with Japanese. http://blog.shibu.jp/article/32831225.html http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/ [...]</description>
		<content:encoded><![CDATA[<p>[...] At present it seems rather unclear whether it supports Japanese, and there were a few guides on getting it to work with Japanese. <a href="http://blog.shibu.jp/article/32831225.html" rel="nofollow">http://blog.shibu.jp/article/32831225.html</a> <a href="http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/" rel="nofollow">http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/</a> [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Notes on installing Gitorious &#171; **deadwinter**</title>
		<link>http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/#comment-17</link>
		<dc:creator>Notes on installing Gitorious &#171; **deadwinter**</dc:creator>
		<pubDate>Wed, 06 Jan 2010 13:36:16 +0000</pubDate>
		<guid isPermaLink="false">http://crunchytoast.com/?p=131#comment-17</guid>
		<description>[...] http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/ [...]</description>
		<content:encoded><![CDATA[<p>[...] <a href="http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/" rel="nofollow">http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/</a> [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: admin</title>
		<link>http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/#comment-16</link>
		<dc:creator>admin</dc:creator>
		<pubDate>Fri, 02 Oct 2009 03:59:03 +0000</pubDate>
		<guid isPermaLink="false">http://crunchytoast.com/?p=131#comment-16</guid>
		<description>@Pat - Sorry about that! It’s all fixed. It was just one of those things.
Would it be any better if I said I’ve been going around talking about Pat Allan the clothing designer??!</description>
		<content:encoded><![CDATA[<p>@Pat &#8211; Sorry about that! It’s all fixed. It was just one of those things.<br />
Would it be any better if I said I’ve been going around talking about Pat Allan the clothing designer??!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: admin</title>
		<link>http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/#comment-15</link>
		<dc:creator>admin</dc:creator>
		<pubDate>Wed, 30 Sep 2009 00:02:14 +0000</pubDate>
		<guid isPermaLink="false">http://crunchytoast.com/?p=131#comment-15</guid>
		<description>@Dave - Thanks for bringing this to my attention. I&#039;ve included the middle dot because I wanted it to be searchable. The middle dot is often used to separate foreign words, mirroring the word breaks of the source language (esp. in foreign names).

More importantly, the boubiki (long vowel sound marker) character U+30FC is also in this range. See http://www.unicode.org/charts/PDF/Unicode-3.2/U32-30A0.pdf

The middle dot may not be everyone&#039;s cup of tea, but the boubiki (pronounced like &quot;bore-bikkie&quot;) is definitely needed.

I may change my mind about the middle separator dot, because it will probably hinder more searches than it helps, but the &quot;boubiki&quot; is essential.</description>
		<content:encoded><![CDATA[<p>@Dave &#8211; Thanks for bringing this to my attention. I&#8217;ve included the middle dot because I wanted it to be searchable. The middle dot is often used to separate foreign words, mirroring the word breaks of the source language (esp. in foreign names).</p>
<p>More importantly, the boubiki (long vowel sound marker) character U+30FC is also in this range. See <a href="http://www.unicode.org/charts/PDF/Unicode-3.2/U32-30A0.pdf" rel="nofollow">http://www.unicode.org/charts/PDF/Unicode-3.2/U32-30A0.pdf</a></p>
<p>The middle dot may not be everyone&#8217;s cup of tea, but the boubiki (pronounced like &#8220;bore-bikkie&#8221;) is definitely needed.</p>
<p>I may change my mind about the middle separator dot, because it will probably hinder more searches than it helps, but the &#8220;boubiki&#8221; is essential.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: pat</title>
		<link>http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/#comment-14</link>
		<dc:creator>pat</dc:creator>
		<pubDate>Tue, 29 Sep 2009 22:09:31 +0000</pubDate>
		<guid isPermaLink="false">http://crunchytoast.com/?p=131#comment-14</guid>
		<description>Hey there, great to see someone documenting how they use Sphinx and Thinking Sphinx with Japanese characters. One thing to note, though: my name&#039;s Pat Allan, not Paul Smith :)</description>
		<content:encoded><![CDATA[<p>Hey there, great to see someone documenting how they use Sphinx and Thinking Sphinx with Japanese characters. One thing to note, though: my name&#8217;s Pat Allan, not Paul Smith <img src='http://crunchytoast.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave</title>
		<link>http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/#comment-13</link>
		<dc:creator>Dave</dc:creator>
		<pubDate>Mon, 21 Sep 2009 10:23:27 +0000</pubDate>
		<guid isPermaLink="false">http://crunchytoast.com/?p=131#comment-13</guid>
		<description>Thanks for this!

Just a quick question - you&#039;re including
# Katakana U+30A0..U+30FF

I don&#039;t know much about the Japanese language, but should U+30FB..U+30FF be included? (I&#039;m asking because U+30FB is fullwidth katakana middle dot, and I thought this was used as a &lt;a href=&quot;http://en.wikipedia.org/wiki/Interpunct#Japanese&quot; rel=&quot;nofollow&quot;&gt;word separator&lt;/a&gt;)?</description>
		<content:encoded><![CDATA[<p>Thanks for this!</p>
<p>Just a quick question &#8211; you&#8217;re including<br />
# Katakana U+30A0..U+30FF</p>
<p>I don&#8217;t know much about the Japanese language, but should U+30FB..U+30FF be included? (I&#8217;m asking because U+30FB is fullwidth katakana middle dot, and I thought this was used as a <a href="http://en.wikipedia.org/wiki/Interpunct#Japanese" rel="nofollow">word separator</a>)?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: admin</title>
		<link>http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/#comment-12</link>
		<dc:creator>admin</dc:creator>
		<pubDate>Fri, 01 May 2009 15:55:59 +0000</pubDate>
		<guid isPermaLink="false">http://crunchytoast.com/?p=131#comment-12</guid>
		<description>@richard - please check out my latest post about the &lt;a href=&quot;http://crunchytoast.com/2009/05/01/japanese-sphinx-explained/&quot; rel=&quot;nofollow&quot;&gt;Sphinx Japanese Character Table&lt;/a&gt;. It explains what each section does, and it confirms that hiragana does not match katakana.

Also, I have separately confirmed that straight katakana searching does work! Can you privately send me a sample of your test data and your setup file? I&#039;ll see what I can do.</description>
		<content:encoded><![CDATA[<p>@richard &#8211; please check out my latest post about the <a href="http://crunchytoast.com/2009/05/01/japanese-sphinx-explained/" rel="nofollow">Sphinx Japanese Character Table</a>. It explains what each section does, and it confirms that hiragana does not match katakana.</p>
<p>Also, I have separately confirmed that straight katakana searching does work! Can you privately send me a sample of your test data and your setup file? I&#8217;ll see what I can do.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: admin</title>
		<link>http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/#comment-11</link>
		<dc:creator>admin</dc:creator>
		<pubDate>Fri, 01 May 2009 07:27:45 +0000</pubDate>
		<guid isPermaLink="false">http://crunchytoast.com/?p=131#comment-11</guid>
		<description>@richard - I&#039;ll look into this and email you separately. Please note, these mappings treat equivalent Katakana and Hiragana characters as the same character. This means searching for かな will also match カナ and ｶﾅ. So you basically get a phonetic search for any type of kana characters.</description>
		<content:encoded><![CDATA[<p>@richard &#8211; I&#8217;ll look into this and email you separately. Please note, these mappings treat equivalent Katakana and Hiragana characters as the same character. This means searching for かな will also match カナ and ｶﾅ. So you basically get a phonetic search for any type of kana characters.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Richard</title>
		<link>http://crunchytoast.com/2009/04/14/sphinx-search-in-japanese/#comment-10</link>
		<dc:creator>Richard</dc:creator>
		<pubDate>Fri, 01 May 2009 02:59:50 +0000</pubDate>
		<guid isPermaLink="false">http://crunchytoast.com/?p=131#comment-10</guid>
		<description>Thanks for your post. I am currently testing Sphinx for use on a large Japanese online shopping website. I followed your character map guide, added it to my sphinx.conf, and reindexed. Queries for kanji-based words all work fine, but katakana words just return every result in the DB. Did you run into this? I am using 0.9.8.1.</description>
		<content:encoded><![CDATA[<p>Thanks for your post. I am currently testing Sphinx for use on a large Japanese online shopping website. I followed your character map guide, added it to my sphinx.conf, and reindexed. Queries for kanji-based words all work fine, but katakana words just return every result in the DB. Did you run into this? I am using 0.9.8.1.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

