
Not long ago I watched a friend implement entry of the pinyin for product names, and found that he was typing the pinyin in by hand. -_-# Comrades, here is a treat for you!

When this article was first published, it covered only one implementation, which uses a Microsoft language pack. Its handling of polyphonic characters is not very satisfactory, though, and it even makes odd mistakes on individual characters. So I am now adding a second method: doing it by hand.

Method one: use the Microsoft language pack

To help developers implement international language conversion, Microsoft provides the Microsoft Visual Studio International Pack. This expansion pack contains language packs for Chinese, Japanese, Korean, English, and other languages, and provides methods for converting between them, getting pinyin, getting the number of characters, and even getting stroke counts.

[This method does not handle polyphonic characters very well. It is relatively simple, however: you only need to import the package. So if you only process individual sentences, or do not care much about polyphonic characters, you can use this approach. After all, it's easy.]

The example here takes Chinese characters as input and gets their pinyin. The results of getting the full pinyin and getting the pinyin initials are as follows:

First, download the Microsoft Visual Studio International Pack from Microsoft's official website. The two downloads are:

Microsoft Visual Studio International Pack 1.0 SR1, Microsoft Visual Studio International Feature Pack 2.0

The downloaded files are "vsintlpack1.zip" and "vsintlpack2.msi". Double-click "vsintlpack2.msi" to install it. The install path is up to you, but remember it, because it will be referenced later.

After installing "vsintlpack2.msi", unzip "vsintlpack1.zip", which contains seven language packs, for example "chspinyinconv.msi" (Chinese to pinyin) and "chtchsconv.msi" (Simplified/Traditional Chinese conversion).

Here we use "chspinyinconv.msi". After double-clicking it and installing it successfully, open Visual Studio and create a new WinForms project, with the form laid out as shown in the figure above.

First, add a reference to the language pack you just installed:

"D:\Program Files (x86)\Microsoft Visual Studio International Pack\Simplified Chinese Pin-Yin Conversion Library\ChnCharInfo.dll"

The default location is the C drive; here I installed it on the D drive. Then add the using directive:

using Microsoft.International.Converters.PinYinConverter; // import the pinyin conversion types

Create a method to get pinyin:

/// <summary>
/// Converts Chinese characters to pinyin
/// </summary>
/// <param name="str">Chinese characters</param>
/// <returns>Full pinyin</returns>
public static string getpinyin(string str)
{
    string r = string.Empty;
    foreach (char obj in str)
    {
        try
        {
            ChineseChar chinesechar = new ChineseChar(obj);
            string t = chinesechar.Pinyins[0].ToString(); // first reading only
            r += t.Substring(0, t.Length - 1); // drop the last character (the tone mark)
        }
        catch
        {
            r += obj.ToString(); // not a Chinese character: keep it unchanged
        }
    }
    return r;
}
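The `Substring(0, t.Length - 1)` call above cuts off the last character of each reading. Here is a minimal standalone sketch of that step, assuming each reading is a run of letters followed by a single tone digit (for example "GU3"). `StripTone` is a hypothetical helper name, not part of the language pack; unlike the line above, it only cuts when the last character really is a digit.

```csharp
using System;

public static class ToneDemo
{
    // Hypothetical helper: removes one trailing digit if there is one,
    // otherwise returns the input unchanged.
    public static string StripTone(string pinyin)
    {
        if (string.IsNullOrEmpty(pinyin)) return pinyin;
        char last = pinyin[pinyin.Length - 1];
        return char.IsDigit(last)
            ? pinyin.Substring(0, pinyin.Length - 1)
            : pinyin;
    }
}
```

For example, `ToneDemo.StripTone("GU3")` gives "GU", while a string with no tone digit comes back as it was.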

Create a method to get the first letter of each character's pinyin:

/// <summary>
/// Converts Chinese characters to the first letters of their pinyin
/// </summary>
/// <param name="str">Chinese characters</param>
/// <returns>Initials</returns>
public static string getfirstpinyin(string str)
{
    string r = string.Empty;
    foreach (char obj in str)
    {
        try
        {
            ChineseChar chinesechar = new ChineseChar(obj);
            string t = chinesechar.Pinyins[0].ToString(); // first reading only
            r += t.Substring(0, 1); // keep just the first letter
        }
        catch
        {
            r += obj.ToString(); // not a Chinese character: keep it unchanged
        }
    }
    return r;
}

Then call the above method in the click event of the "Convert Pinyin" button:

// Chinese characters to pinyin
private void btn_one_click(object sender, EventArgs e)
{
    string source = this.txt_chinesecharacter_one.Text.Trim(); // get the input source text
    string result = getpinyin(source); // call the method to get the pinyin
    this.txt_pinyin_one.Text = result;
}

Call the above method in the click event of the "Convert to initials" button:

// Chinese characters to initials
private void btn_two_click(object sender, EventArgs e)
{
    string source = this.txt_chinesecharacter_one.Text.Trim(); // get the input source text
    string result = getfirstpinyin(source); // call the method to get the initials
    this.txt_pinyin_one.Text = result;
}

At this point, 80% of the work is done. Run the program and you will find that when you click "Convert Pinyin", the result looks like this:

This is not the "gu ying" effect I showed at the start. That is because I did a little extra processing when getting the pinyin:

// Chinese characters to pinyin
private void btn_one_click(object sender, EventArgs e)
{
    string source = this.txt_chinesecharacter_one.Text.Trim(); // get the input source text

    string result = string.Empty; // result of the pinyin conversion
    string temp = string.Empty; // temporary variable used by the foreach below
    foreach (char item in source) // walk through each source character
    {
        temp = getpinyin(item.ToString()); // convert each character to pinyin
        // Processing: uppercase the first letter and lowercase the rest
        result += (string.Format("{0}{1}", temp.Substring(0, 1).ToUpper(), temp.Substring(1).ToLower()));
    }

    //string result = getpinyin(source); // call the method to get the pinyin
    this.txt_pinyin_one.Text = result;
}
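The capitalization step can also be pulled out into a small helper so it is easy to try on its own. This is only a sketch: `Capitalize` and `JoinCapitalized` are hypothetical names, and unlike the handler above it returns empty input unchanged instead of calling `Substring` on it.

```csharp
using System;
using System.Text;

public static class PinyinCase
{
    // Uppercases the first letter of one syllable and lowercases the rest.
    public static string Capitalize(string syllable)
    {
        if (string.IsNullOrEmpty(syllable)) return syllable;
        return char.ToUpperInvariant(syllable[0])
            + syllable.Substring(1).ToLowerInvariant();
    }

    // Capitalizes each syllable and concatenates them without separators.
    public static string JoinCapitalized(string[] syllables)
    {
        var sb = new StringBuilder();
        foreach (string s in syllables)
        {
            sb.Append(Capitalize(s));
        }
        return sb.ToString();
    }
}
```

With this, `PinyinCase.JoinCapitalized(new[] { "GU", "YING" })` produces "GuYing".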

OK, this feature is now complete. The other language packs work in a similar way; you can search Baidu for "Microsoft Visual Studio International Pack" to find examples of inter-language conversion and other functions.

Method two: manual coding

This method is actually not difficult. Put plainly, you define an array or collection of pinyin keyed by each character's encoding value, and look the pinyin up from it. Note that the values used below are derived from the character's two-byte GB2312/GBK encoding, not from its Unicode code point.

First define the pinyin zone encoding array:

//Define the Pinyin zone encoding array
private static int [] getvalue=new int []
 {
 -20319, -20317, -20304, -20295, -20292, -20283, -20265, -20257, -20242, -20230, -20051, -20036, -20032, -20026, -20002, -19990, -19986, -19982, -19976, -19805, -19784, -19775, -19774, -19763, -19756, -19751, -19746, -19741, -19739, -19728, -19725, -19715, -19540, -19531, -19525, -19515, -19500, -19484, -19479, -19467, -19289, -19288, -19281, -19275, -19270, -19263, -19261, -19249, -19243, -19242, -19238, -19235, -19227, -19224, -19218, -19212, -19038, -19023, -19018, -19006, -19003, -18996, -18977, -18961, -18952, -18783, -18774, -18773, -18763, -18756, -18741, -18735, -18731, -18722, -18710, -18697, -18696, -18526, -18518, -18501, -18490, -18478, -18463, -18448, -18447, -18446, -18239, -18237, -18231, -18220, -18211, -18201, -18184, -18183, -18181, -18012, -17997, -17988, -17970, -17964, -17961, -17950, -17947, -17931, -17928, -17922, -17759, -17752, -17733, -17730, -17721, -17703, -17701, -17697, -17692, -17683, -17676, -17496, -17487, -17482, -17468, -17454, -17433, -17427, -17417, -17202, -17185, -16983, -16970, -16942, -16915, -16733, -16708, -16706, -16689, -16664, -16657, -16647, -16474, -16470, -16465, -16459, -16452, -16448, -16433, -16429, -16427, -16423, -16419, -16412, -16407, -16403, -16401, -16393, -16220, -16216, -16212, -16205, -16202, -16187, -16180, -16171, -16169, -16158, -16155, -15959, -15958, -15944, -15933, -15920, -15915, -15903, -15889, -15878, -15707, -15701, -15681, -15667, -15661, -15659, -15652, -15640, -15631, -15625, -15454, -15448, -15436, -15435, -15419, -15416, -15408, -15394, -15385, -15377, -15375, -15369, -15363, -15362, -15183, -15180, -15165, -15158, -15153, -15150, -15149, -15144, -15143, -15141, -15140, -15139, -15128, -15121, -15119, -15117, -15110, -15109, -14941, -14937, -14933, -14930, -14929, -14928, -14926, -14922, -14921, -14914, -14908, -14902, -14894, -14889, -14882, -14873, -14871, -14857, -14678, -14674, -14670, -14668, -14663, -14654, -14645, -14630, -14594, -14429, -14407, -14399, -14384, -14379, -14368, 
-14355, -14353, -14345, -14170, -14159, -14151, -14149, -14145, -14140, -14137, -14135, -14125, -14123, -14122, -14112, -14109, -14099, -14097, -14094, -14092, -14090, -14087, -14083, -13917, -13914, -13910, -13907, -13906, -13905, -13896, -13894, -13878, -13870, -13859, -13847, -13831, -13658, -13611, -13601, -13406, -13404, -13400, -13398, -13395, -13391, -13387, -13383, -13367, -13359, -13356, -13343, -13340, -13329, -13326, -13318, -13147, -13138, -13120, -13107, -13096, -13095, -13091, -13076, -13068, -13063, -13060, -12888, -12875, -12871, -12860, -12858, -12852, -12849, -12838, -12831, -12829, -12812, -12802, -12607, -12597, -12594, -12585, -12556, -12359, -12346, -12320, -12300, -12120, -12099, -12089, -12074, -12067, -12058, -12039, -11867, -11861, -11847, -11831, -11798, -11781, -11604, -11589, -11536, -11358, -11340, -11339, -11324, -11303, -11097, -11077, -11067, -11055, -11052, -11045, -11041, -11038, -11024, -11020, -11019, -11018, -11014, -10838, -10832, -10815, -10800, -10790, -10780, -10764, -10587, -10544, -10533, -10519, -10331, -10329, -10328, -10322, -10315, -10309, -10307, -10296, -10281, -10274, -10270, -10262, -10260, -10256, -10254
 };

Then define the Pinyin array:

//define Pinyin array
private static string[] getname = new string[]
 {
 "a", "ai", "an", "ang", "ao", "ba", "bai", "ban", "bang", "bao", "bei", "ben", "beng", "bi", "bian", "biao", "bie", "bin", "bing", "bo", "bu", "ca", "cai", "can", "cang", "cao", "ce", "ceng", "cha", "chai", "chan", "chang", "chao", "che", "chen", "cheng", "chi", "chong", "chou", "chu", "chuai", "chuan", "chuang", "chui", "chun", "chuo", "ci", "cong", "cou", "cu", "cuan", "cui", "cun", "cuo", "da", "dai", "dan", "dang", "dao", "de", "deng", "di", "dian", "diao", "die", "ding", "diu", "dong", "dou", "du", "duan", "dui", "dun", "duo", "e", "en", "er", "fa", "fan", "fang", "fei", "fen", "feng", "fo", "fou", "fu", "ga", "gai", "gan", "gang", "gao", "ge", "gei", "gen", "geng", "gong", "gou", "gu", "gua", "guai", "guan", "guang", "gui", "gun", "guo", "ha", "hai", "han", "hang", "hao", "he", "hei", "hen", "heng", "hong", "hou", "hu", "hua", "huai", "huan", "huang", "hui", "hun", "huo", "ji", "jia", "jian", "jiang", "jiao", "jie", "jin", "jing", "jiong", "jiu", "ju", "juan", "jue", "jun", "ka", "kai", "kan", "kang", "kao", "ke", "ken", "keng", "kong", "kou", "ku", "kua", "kuai", "kuan", "kuang", "kui", "kun", "kuo", "la", "lai", "lan", "lang", "lao", "le", "lei", "leng", "li", "lia", "lian", "liang", "liao", "lie", "lin", "ling", "liu", "long", "lou", "lu", "lv", "luan", "lue", "lun", "luo", "ma", "mai", "man", "mang", "mao", "me", "mei", "men", "meng", "mi", "mian", "miao", "mie", "min", "ming", "miu", "mo", "mou", "mu", "na", "nai", "nan", "nang", "nao", "ne", "nei", "nen", "neng", "ni", "nian", "niang", "niao", "nie", "nin", "ning", "niu", "nong", "nu", "nv", "nuan", "nue", "nuo", "o", "ou", "pa", "pai", "pan", "pang", "pao", "pei", "pen", "peng", "pi", "pian", "piao", "pie", "pin", "ping", "po", "pu", "qi", "qia", "qian", "qiang", "qiao", "qie", "qin", "qing", "qiong", "qiu", "qu", "quan", "que", "qun", "ran", "rang", "rao", "re", "ren", "reng", "ri", "rong", "rou", "ru", "ruan", "rui", "run", "ruo", "sa", "sai", "san", "sang", "sao", "se", "sen", "seng", "sha", 
"shai", "shan", "shang", "shao", "she", "shen", "sheng", "shi", "shou", "shu", "shua", "shuai", "shuan", "shuang", "shui", "shun", "shuo", "si", "song", "sou", "su", "suan", "sui", "sun", "suo", "ta", "tai", "tan", "tang", "tao", "te", "teng", "ti", "tian", "tiao", "tie", "ting", "tong", "tou", "tu", "tuan", "tui", "tun", "tuo", "wa", "wai", "wan", "wang", "wei", "wen", "weng", "wo", "wu", "xi", "xia", "xian", "xiang", "xiao", "xie", "xin", "xing", "xiong", "xiu", "xu", "xuan", "xue", "xun", "ya", "yan", "yang", "yao", "ye", "yi", "yin", "ying", "yo", "yong", "you", "yu", "yuan", "yue", "yun", "za", "zai", "zan", "zang", "zao", "ze", "zei", "zen", "zeng", "zha", "zhai", "zhan", "zhang", "zhao", "zhe", "zhen", "zheng", "zhi", "zhong", "zhou", "zhu", "zhua", "zhuai", "zhuan", "zhuang", "zhui", "zhun", "zhuo", "zi", "zong", "zou", "zu", "zuan", "zui", "zun", "zuo"
 };
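The two arrays above are parallel: entry i of the code array is the first code that maps to entry i of the pinyin array, so converting a code means finding the last threshold that is less than or equal to it. Here is a minimal sketch of that lookup on a three-entry excerpt. `RangeLookup`, `Thresholds` and `Names` are hypothetical names, and it uses `Array.BinarySearch` rather than the backwards linear scan in the conversion method that follows; both give the same index because the thresholds are sorted in ascending order.

```csharp
using System;

public static class RangeLookup
{
    // First three entries of the two tables (an excerpt for illustration only).
    private static readonly int[] Thresholds = { -20319, -20317, -20304 };
    private static readonly string[] Names = { "a", "ai", "an" };

    // Returns the syllable whose threshold is the largest one <= code,
    // or null when code lies below the first threshold.
    public static string Lookup(int code)
    {
        int i = Array.BinarySearch(Thresholds, code);
        if (i < 0)
        {
            // Not found: ~i is the index of the next larger threshold,
            // so the range containing code starts one entry earlier.
            i = ~i - 1;
        }
        return i >= 0 ? Names[i] : null;
    }
}
```

`Lookup(-20318)` falls between the first two thresholds and returns "a", while `Lookup(-20317)` returns "ai". Because this is only an excerpt, every code from -20304 upward maps to "an" here; the full table has no such shortcut.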

Then define the method to convert the string:

/// <summary>Converts Chinese characters to pinyin</summary>
/// <param name="chstr">Chinese character string</param>
/// <returns>The converted pinyin string</returns>
// Requires: using System.Text.RegularExpressions;
public string strconverttopinyin(string chstr)
{
    Regex reg = new Regex("^[\u4e00-\u9fa5]$"); // matches a single Chinese character
    byte[] arr = new byte[2];
    string pystr = "";
    int asc = 0, m1 = 0, m2 = 0;
    char[] mchar = chstr.ToCharArray(); // get the character array for the input
    for (int j = 0; j < mchar.Length; j++)
    {
        // If the character is a Chinese character
        if (reg.IsMatch(mchar[j].ToString()))
        {
            // Note: Encoding.Default yields the two-byte codes the tables expect only
            // when the system default code page is GB2312/GBK.
            arr = System.Text.Encoding.Default.GetBytes(mchar[j].ToString());
            m1 = (short)(arr[0]);
            m2 = (short)(arr[1]);
            asc = m1 * 256 + m2 - 65536; // two-byte code as a negative 16-bit value
            if (asc > 0 && asc < 160)
            {
                pystr += mchar[j];
            }
            else
            {
                switch (asc)
                {
                    // Individual characters handled outside the range table
                    case -9254:
                        pystr += "zhen"; break;
                    case -8985:
                        pystr += "qian"; break;
                    case -5463:
                        pystr += "jia"; break;
                    case -8274:
                        pystr += "ge"; break;
                    case -5448:
                        pystr += "ga"; break;
                    case -5447:
                        pystr += "la"; break;
                    case -4649:
                        pystr += "chen"; break;
                    case -5436:
                        pystr += "mao"; break;
                    case -5213:
                        pystr += "mao"; break;
                    case -3597:
                        pystr += "die"; break;
                    case -5659:
                        pystr += "tian"; break;
                    default:
                        // Scan from the end for the last threshold <= asc
                        for (int i = (getvalue.Length - 1); i >= 0; i--)
                        {
                            if (getvalue[i] <= asc)
                            {
                                pystr += getname[i]; // take the matching pinyin
                                break;
                            }
                        }
                        break;
                }
            }
        }
        else // not a Chinese character
        {
            pystr += mchar[j].ToString(); // append it unchanged
        }
    }
    return pystr; // return the resulting pinyin
}
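To see where the negative numbers in the code table come from, here is a small sketch of the arithmetic used in the method above, applied directly to two bytes so that it does not depend on the system encoding. `ZoneCode.FromBytes` is a hypothetical helper name. The bytes 0xB0 0xA1 are the GB2312 encoding of "啊", the first character in the table.

```csharp
using System;

public static class ZoneCode
{
    // Combines the high and low byte of a two-byte code and shifts the result
    // into the negative range, the same way as m1 * 256 + m2 - 65536 above.
    public static int FromBytes(byte high, byte low)
    {
        return high * 256 + low - 65536;
    }
}
```

`ZoneCode.FromBytes(0xB0, 0xA1)` is 176 * 256 + 161 - 65536 = -20319, which is exactly the first entry of the code array, the one paired with "a".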

This method is not ideal at recognizing polyphonic characters either, but since it is implemented by hand, it can also be controlled by hand. For example, for "家长" ("parents"), the result we want is "jia zhang", but the result we get is "jia chang".

We can handle phrases that contain polyphonic characters separately. For example, define a collection of polyphonic characters together with the phrases they appear in and the reading used in each phrase. When converting to pinyin, check whether a character is polyphonic and, if it is, look up the correct pinyin from the phrase it belongs to.

This is similar to the keyword filtering we do in web development.
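One way to sketch that idea is to check a small phrase table before falling back to per-character conversion. Everything here is an assumption for illustration: the `PhraseOverride` class, the single table entry, and the `perChar` callback, which stands in for whichever per-character converter you use. A real table with overlapping phrases would also need to try longer phrases first.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PhraseOverride
{
    // Hypothetical phrase table: phrase -> pinyin for the whole phrase.
    private static readonly Dictionary<string, string> Phrases =
        new Dictionary<string, string>
        {
            { "家长", "jiazhang" },
        };

    // Converts text, preferring a phrase match at each position and
    // falling back to perChar for a single character otherwise.
    public static string Convert(string text, Func<char, string> perChar)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < text.Length)
        {
            bool matched = false;
            foreach (KeyValuePair<string, string> kv in Phrases)
            {
                // Ordinal comparison of the phrase against text starting at i
                if (string.CompareOrdinal(text, i, kv.Key, 0, kv.Key.Length) == 0)
                {
                    sb.Append(kv.Value);
                    i += kv.Key.Length;
                    matched = true;
                    break;
                }
            }
            if (!matched)
            {
                sb.Append(perChar(text[i]));
                i++;
            }
        }
        return sb.ToString();
    }
}
```

For instance, `PhraseOverride.Convert("家长", c => c.ToString())` returns "jiazhang" regardless of what the per-character table would give for the second character.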
