Boost_tokenizer

[[Programming_C]]

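When splitting a multibyte string on a given separator, I want to keep the separator itself (which is, of course, also multibyte). wcstok and strtok swallow the delimiters. So, for the first time in a while, I reached for Boost.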
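To see the problem concretely, here is a minimal sketch of splitting with `std::wcstok`. The helper name `split_with_wcstok` is made up for illustration, and the code assumes the standard three-argument `std::wcstok` (some toolchains declare a two-argument variant instead). The delimiters are overwritten with `L'\0'` inside the buffer, so they never show up as tokens.

```cpp
#include <cassert>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: split a copy of `text` with std::wcstok.
// Every delimiter is replaced by L'\0' in the buffer and is not returned.
std::vector<std::wstring> split_with_wcstok(std::wstring text, const wchar_t* delims) {
    assert(delims != nullptr);
    std::vector<std::wstring> tokens;
    if (text.empty()) return tokens;

    wchar_t* state = nullptr;  // position saved between calls
    for (wchar_t* tok = std::wcstok(&text[0], delims, &state);
         tok != nullptr;
         tok = std::wcstok(nullptr, delims, &state)) {
        tokens.push_back(tok);
    }
    return tokens;
}
```

Calling `split_with_wcstok(L"もも！名詞！果物", L"！")` yields only the three words; the two "！" characters are gone.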


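 #include <clocale>   // setlocale
 #include <cstdlib>   // wcstombs
 #include <iostream>
 #include <string>
 #include <boost/tokenizer.hpp>
 
 int main() {
   // Use the environment's locale so wcstombs converts to its multibyte encoding.
   std::setlocale(LC_ALL, "");
 
   typedef boost::tokenizer<boost::char_separator<wchar_t>,
                            std::wstring::const_iterator,
                            std::wstring> wtokenizer;
 
   std::wstring ss = L"もも！名詞！果物";
   // The second argument lists the delimiters that are kept and returned as tokens.
   boost::char_separator<wchar_t> sep(L"！", L"！", boost::keep_empty_tokens);
   wtokenizer wtok(ss, sep);
   char str[64];
 
   for (wtokenizer::iterator it = wtok.begin(); it != wtok.end(); ++it) {
     // Leave one byte free for the terminator.
     std::size_t n = std::wcstombs(str, it->c_str(), sizeof(str) - 1);
     if (n == static_cast<std::size_t>(-1)) {
       std::cerr << "conversion failed\n";
       return 1;
     }
     // wcstombs omits the terminator when it fills the whole buffer,
     // so terminate at the returned length.
     str[n] = '\0';
     std::cout << n << " : " << str << "\n";
   }
   return 0;
 }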

 $ g++ wtokenizer.cpp
 $  ./a.out 
 6 : もも
 3 : ！
 6 : 名詞
 3 : ！
 6 : 果物
 $ 

Looks usable.
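The fixed-size `char` buffer passed to `wcstombs` in the listing above limits how long a token can be. One way around that, as a rough sketch, is to ask `std::wcsrtombs` for the required length first (a null destination makes it only count bytes) and then convert into a `std::string` of exactly that size. The helper name `to_multibyte` is hypothetical, and the result depends on the locale set with `setlocale`.

```cpp
#include <cassert>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert a wide string to the current locale's
// multibyte encoding without a fixed-size buffer.
std::string to_multibyte(const std::wstring& wide) {
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = wide.c_str();

    // First pass: a null destination only reports the number of bytes needed.
    std::size_t len = std::wcsrtombs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1))
        throw std::runtime_error("not representable in the current locale");

    std::string out(len, '\0');
    if (len > 0) {
        // Second pass: convert for real into the exactly sized string.
        state = std::mbstate_t();
        src = wide.c_str();
        std::size_t written = std::wcsrtombs(&out[0], &src, len, &state);
        assert(written == len);
        (void)written;
    }
    return out;
}
```

In the loop this would replace the buffer handling with something like `std::string s = to_multibyte(*it);`, and `s.size()` gives the byte count that was printed as `i`.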

[[This page:http://www5d.biglobe.ne.jp/~y0ka/2005-09-02-3.html]] and [[this one:http://ml.tietew.jp/cppll/cppll/article/6130]] were helpful, and [[this one:http://homepage2.nifty.com/c-labo/boost_tokenizer.html]] even more so. In a different sense,
[[this page:http://www5d.biglobe.ne.jp/~y0ka/2005-09-02-3.html]] was helpful too.

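With MinGW, if you are not going to add Boost to the include path, copy the whole boost folder from the extracted Boost archive into MinGW's include folder. Also move the libs folder from the extracted Boost archive.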
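If setting up Boost is more trouble than it is worth, a delimiter-keeping split can also be approximated with the standard library alone. The following is only a sketch: `split_keep_delims` is a made-up name, and unlike the `keep_empty_tokens` setting used above it does not emit empty tokens between adjacent delimiters.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split `text` on any character in `delims`,
// returning each delimiter as a one-character token of its own.
std::vector<std::wstring> split_keep_delims(const std::wstring& text,
                                            const std::wstring& delims) {
    assert(!delims.empty());
    std::vector<std::wstring> tokens;
    std::wstring::size_type start = 0;

    while (start < text.size()) {
        std::wstring::size_type pos = text.find_first_of(delims, start);
        if (pos == std::wstring::npos) {
            tokens.push_back(text.substr(start));  // last piece, no delimiter left
            break;
        }
        if (pos > start)
            tokens.push_back(text.substr(start, pos - start));  // text before the delimiter
        tokens.push_back(text.substr(pos, 1));                   // the delimiter itself
        start = pos + 1;
    }
    return tokens;
}
```

For `L"もも！名詞！果物"` with `L"！"` as the delimiter set, this returns five tokens in the same order as the Boost version: the three words with a "！" token between each pair.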