A PHP script for joining detached dakuten (also usable from Emacs)

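The design data and PDFs I receive sometimes contain text in which the dakuten are detached from their base kana. Such text can't be used as-is after copying and pasting, so a while back I wrote this simple PHP program to deal with it.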

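Also, I don't really trust mb_convert_kana, so I used preg_replace instead, verbose as that is. The script merges a full-width hiragana or katakana character and a following dakuten or handakuten into a single character.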

Depending on your font you may not be able to see the difference, but the patterns cover three variants each of the dakuten and the handakuten.

PHP code for dakuten conversion

<?php
// Patterns to replace: a base kana followed by a detached mark
$arr1 = array(
    "/う(゙|゛|゙)/","/ウ(゙|゛|゙)/",
    "/か(゙|゛|゙)/","/き(゙|゛|゙)/","/く(゙|゛|゙)/","/け(゙|゛|゙)/","/こ(゙|゛|゙)/","/カ(゙|゛|゙)/","/キ(゙|゛|゙)/","/ク(゙|゛|゙)/","/ケ(゙|゛|゙)/","/コ(゙|゛|゙)/",
    "/さ(゙|゛|゙)/","/し(゙|゛|゙)/","/す(゙|゛|゙)/","/せ(゙|゛|゙)/","/そ(゙|゛|゙)/","/サ(゙|゛|゙)/","/シ(゙|゛|゙)/","/ス(゙|゛|゙)/","/セ(゙|゛|゙)/","/ソ(゙|゛|゙)/",
    "/た(゙|゛|゙)/","/ち(゙|゛|゙)/","/つ(゙|゛|゙)/","/て(゙|゛|゙)/","/と(゙|゛|゙)/","/タ(゙|゛|゙)/","/チ(゙|゛|゙)/","/ツ(゙|゛|゙)/","/テ(゙|゛|゙)/","/ト(゙|゛|゙)/",
    "/は(゙|゛|゙)/","/ひ(゙|゛|゙)/","/ふ(゙|゛|゙)/","/へ(゙|゛|゙)/","/ほ(゙|゛|゙)/","/ハ(゙|゛|゙)/","/ヒ(゙|゛|゙)/","/フ(゙|゛|゙)/","/ヘ(゙|゛|゙)/","/ホ(゙|゛|゙)/",
    "/は(゚|°|゚)/","/ひ(゚|°|゚)/","/ふ(゚|°|゚)/","/へ(゚|°|゚)/","/ほ(゚|°|゚)/","/ハ(゚|°|゚)/","/ヒ(゚|°|゚)/","/フ(゚|°|゚)/","/ヘ(゚|°|゚)/","/ホ(゚|°|゚)/");

// Replacement characters, in the same order as the patterns
$arr2 = array(
    "ゔ","ヴ",
    "が","ぎ","ぐ","げ","ご","ガ","ギ","グ","ゲ","ゴ",
    "ざ","じ","ず","ぜ","ぞ","ザ","ジ","ズ","ゼ","ゾ",
    "だ","ぢ","づ","で","ど","ダ","ヂ","ヅ","デ","ド",
    "ば","び","ぶ","べ","ぼ","バ","ビ","ブ","ベ","ボ",
    "ぱ","ぴ","ぷ","ぺ","ぽ","パ","ピ","プ","ペ","ポ");

// Sample input string
$target = "トッフ°ページ は°るふ°んて";

// Run the replacement
$result = preg_replace($arr1, $arr2, $target);

// Output
echo $result;
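
The 52 patterns above can also be written as a lookup table plus one regular expression per mark type. The following is a minimal sketch of that alternative, not the script above: join_dakuten() and kana_chars() are hypothetical names, PHP 7 or later is assumed, and the marks are written as code point escapes (U+3099, U+309B, U+FF9E for the dakuten; U+309A, U+309C, U+FF9F for the handakuten) on the assumption that these are the three variants meant. The degree sign U+00B0 is also accepted as a handakuten because the sample string appears to use it.

```php
<?php
// Sketch only: join_dakuten() and kana_chars() are hypothetical helper names.

// Split a UTF-8 string into an array of single characters.
function kana_chars(string $s): array
{
    return preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
}

// Merge "kana + detached mark" pairs into single precomposed kana.
function join_dakuten(string $text): string
{
    // Base kana => voiced kana.
    $voiced = array_combine(
        kana_chars('うかきくけこさしすせそたちつてとはひふへほウカキクケコサシスセソタチツテトハヒフヘホ'),
        kana_chars('ゔがぎぐげござじずぜぞだぢづでどばびぶべぼヴガギグゲゴザジズゼゾダヂヅデドバビブベボ')
    );
    // Base kana => semi-voiced kana.
    $semi = array_combine(
        kana_chars('はひふへほハヒフヘホ'),
        kana_chars('ぱぴぷぺぽパピプペポ')
    );

    // Assumed mark variants, written as code points so they stay visible:
    // combining, spacing, halfwidth (plus the degree sign for the handakuten).
    $rules = [
        '/(.)[\x{3099}\x{309B}\x{FF9E}]/u'         => $voiced,
        '/(.)[\x{309A}\x{309C}\x{FF9F}\x{00B0}]/u' => $semi,
    ];

    foreach ($rules as $pattern => $map) {
        $text = preg_replace_callback(
            $pattern,
            // Leave the pair untouched when the base character has no mapping.
            function (array $m) use ($map) {
                return $map[$m[1]] ?? $m[0];
            },
            $text
        ) ?? $text; // keep the input if PCRE reports an error (e.g. invalid UTF-8)
    }
    return $text;
}

echo join_dakuten("トッフ\u{00B0}ページ は\u{309C}るふ\u{FF9F}んて"), PHP_EOL;
```

Because the lookup decides what gets merged, a mark that follows a character with no voiced form (for example あ) is left alone. Treating the degree sign as a handakuten is a trade-off: it would also rewrite a genuine degree sign that happens to follow a ha-row kana.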

Making dakuten conversion easy to use from Emacs

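Add the following as the first line of the code above:
#!/usr/bin/php
Then change line 21 (the $target assignment) to:
$target = file_get_contents('php://stdin');
Save the file wherever you like and make it executable. Then add the following to your configuration file, init.el (emacs.el), and you will be able to convert the selected region right away with a keyboard shortcut.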

elisp

(defun dakuten ()
  "Join detached dakuten and handakuten in the region."
  (interactive)
  (save-excursion
    ;; Path to the file containing the PHP code above
    (shell-command-on-region (point) (mark) "~/.emacs.d/php/dakuten" nil t)))
;; Bind this function to a keyboard shortcut
(global-set-key (kbd "C-c 9") 'dakuten)

For details, see "How to run PHP code from within Emacs".
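
For the Emacs command to work, the script has to behave as a filter: shell-command-on-region sends the region to the command's standard input and replaces the region with whatever the command writes to standard output. The sketch below illustrates that shape under stated assumptions: convert_stream() is a hypothetical helper, the two-entry tables are abbreviated stand-ins for the full $arr1 and $arr2, and in-memory streams take the place of stdin and stdout so the example runs on its own.

```php
<?php
// Sketch of the read-all, convert, write-all filter shape.
// convert_stream() and the abbreviated tables are illustrative assumptions.

function convert_stream($in, $out, array $patterns, array $replacements): void
{
    // Read the whole input (in Emacs, this is the selected region).
    $text = stream_get_contents($in);
    // Fall back to the unchanged text if preg_replace reports an error.
    $converted = preg_replace($patterns, $replacements, $text) ?? $text;
    // Write the result; Emacs puts this back in place of the region.
    fwrite($out, $converted);
}

// Abbreviated tables: only フ and ふ followed by a handakuten-like mark.
$patterns = [
    '/フ[\x{309A}\x{309C}\x{FF9F}\x{00B0}]/u',
    '/ふ[\x{309A}\x{309C}\x{FF9F}\x{00B0}]/u',
];
$replacements = ['プ', 'ぷ'];

// Demo on in-memory streams instead of the real stdin and stdout.
$in = fopen('php://memory', 'w+');
fwrite($in, "トッフ\u{00B0}ページ");
rewind($in);
$out = fopen('php://memory', 'w+');

convert_stream($in, $out, $patterns, $replacements);

rewind($out);
$converted = stream_get_contents($out);
echo $converted, PHP_EOL;
```

In a real filter the demo lines would give way to a single call such as convert_stream(STDIN, STDOUT, $arr1, $arr2); the STDIN and STDOUT constants are available when PHP runs from the command line.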