
Is there a good way to extract substrings with regular expressions?

code
 string str = "Oh oh<r = greeting>greeting</r>";

I would like to extract two substrings from this string: the "greeting" inside the <r = ...> tag (the ruby text) and the "greeting" between the opening and closing tags (the kanji). The quotation marks are not part of the result. With my current approach I cannot get the result I want unless I run a replacement on each match afterwards. Is there a better way to extract them?

code
// Extract the ruby text (the value inside the <r = ...> tag)
Match rubyMatch = Regex.Match(str, @"<r = (.+?)>", RegexOptions.Singleline);
// Extract the kanji (the text between the opening and closing tags)
Match kanjiMatch = Regex.Match(str, @">(.+)<", RegexOptions.Singleline);


This is what I have so far. After matching, I run a replacement on each match to pull out only the characters I need.

code
// Strip the surrounding delimiters by replacing each match with its captured group
string ruby = Regex.Replace(rubyMatch.ToString(), @"<r = (.+)>", "$1", RegexOptions.Singleline);
string kanji = Regex.Replace(kanjiMatch.ToString(), @">(.+)<", "$1", RegexOptions.Singleline);
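
Put together, the two-step approach described so far can be sketched as a small complete program. This is only an illustration: the class name `TwoStepDemo` and the helper methods `ExtractRuby` and `ExtractKanji` are hypothetical, and the patterns assume the tag is written with the spaces shown in the sample string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TwoStepDemo
{
    // Step 1: match the tag together with its delimiters.
    // Step 2: replace that match with the contents of capture group 1.
    public static string ExtractRuby(string input)
    {
        Match m = Regex.Match(input, @"<r = (.+?)>", RegexOptions.Singleline);
        return Regex.Replace(m.ToString(), @"<r = (.+)>", "$1", RegexOptions.Singleline);
    }

    public static string ExtractKanji(string input)
    {
        Match m = Regex.Match(input, @">(.+)<", RegexOptions.Singleline);
        return Regex.Replace(m.ToString(), @">(.+)<", "$1", RegexOptions.Singleline);
    }

    public static void Main()
    {
        string str = "Oh oh<r = greeting>greeting</r>";
        Console.WriteLine(ExtractRuby(str));   // text inside the <r = ...> tag
        Console.WriteLine(ExtractKanji(str));  // text between the tags
    }
}
```

Each value is matched twice here, once by `Regex.Match` and once by `Regex.Replace`, which is the redundancy the question is asking how to avoid.
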
  • Answer #1

    Is this what you want to know?

    string ruby = rubyMatch.Groups[1].Value;
    string kanji = kanjiMatch.Groups[1].Value;


    If you want to do each extraction in one line, simply chain the calls:

    string ruby = Regex.Match(str, @"<r = (.+?)>", RegexOptions.Singleline).Groups[1].Value;
    string kanji = Regex.Match(str, @">(.+)<", RegexOptions.Singleline).Groups[1].Value;


    Alternatively, you can retrieve the values directly by using lookahead and lookbehind:

    string ruby = Regex.Match(str, @"(?<=<r = ).+?(?=>)", RegexOptions.Singleline).Value;
    string kanji = Regex.Match(str, @"(?<=>).+(?=<)", RegexOptions.Singleline).Value;
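
To compare the capture-group form and the lookaround form side by side, here is a minimal self-contained sketch. The class name `ExtractDemo` and its four helper methods are hypothetical names chosen for illustration, and the patterns assume the tag is written exactly as in the sample string, spaces included.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExtractDemo
{
    // Capture-group form: the delimiters are part of the match, Groups[1] holds the inner text.
    public static string RubyByGroup(string s) =>
        Regex.Match(s, @"<r = (.+?)>", RegexOptions.Singleline).Groups[1].Value;

    public static string KanjiByGroup(string s) =>
        Regex.Match(s, @">(.+)<", RegexOptions.Singleline).Groups[1].Value;

    // Lookaround form: the delimiters are only asserted, so the match itself is the inner text.
    public static string RubyByLookaround(string s) =>
        Regex.Match(s, @"(?<=<r = ).+?(?=>)", RegexOptions.Singleline).Value;

    public static string KanjiByLookaround(string s) =>
        Regex.Match(s, @"(?<=>).+(?=<)", RegexOptions.Singleline).Value;

    public static void Main()
    {
        string str = "Oh oh<r = greeting>greeting</r>";

        // Both forms should agree on the sample input.
        Console.WriteLine(RubyByGroup(str) == RubyByLookaround(str));
        Console.WriteLine(KanjiByGroup(str) == KanjiByLookaround(str));
    }
}
```

When nothing matches, both forms return an empty string rather than throwing, so checking `Match.Success` is worth considering if an empty result needs to be told apart from a failed match.
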