Tuesday, May 3, 2011

How to extract the formatting parts from a string

How does one extract only the formatting parts from a string? For example, given the string "Warning: the error {0} has occurred. {1, 9} module has {2:0%} done.", I would like to extract {0}, {1, 9} and {2:0%} into a string array. Is there a regular expression, or some other approach, that avoids my current method of looping over the string, calling IndexOf for '{' and '}' alternately and taking a Substring between them?
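For reference, the manual scan described in the question can be sketched as follows. It is written in Java here (the .NET equivalents are String.IndexOf and String.Substring), and the class and method names are hypothetical. It assumes braces are never nested or escaped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the loop from the question; not taken from any answer.
class BraceScanner {
    // Collects every "{...}" span by alternately searching for '{' and the next '}'.
    static List<String> extract(String input) {
        List<String> items = new ArrayList<>();
        int open = input.indexOf('{');
        while (open >= 0) {
            int close = input.indexOf('}', open + 1);
            if (close < 0) {
                break; // unmatched '{': nothing more to extract
            }
            items.add(input.substring(open, close + 1)); // include both braces
            open = input.indexOf('{', close + 1);
        }
        return items;
    }

    public static void main(String[] args) {
        String input = "Warning: the error {0} has occurred. {1, 9} module has {2:0%} done.";
        for (String item : extract(input)) {
            System.out.println(item);
        }
    }
}
```

This works, but the index bookkeeping is exactly what a regular expression can replace.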

From stackoverflow
  • Will some variant of "\{[^}]+\}" not work? Find all the matches, then take the substring from the start to the end of each match.

  • new Regex(@"\{[0-9:,% .]+\}");
    

    You may have to tweak/tune it to account for any additional formatting options that you haven't provided in your example.

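  • In Java, a Pattern compiles the regex, and the Matcher it creates for the input returns each matching substring in turn.

    For example:

    // Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    String str = "Warning: the error {0} has occurred. {1, 9} module has {2:0%} done.";

    // Braces must be escaped in Java. "[^}]*" stops at the first '}', whereas
    // a greedy ".*" would run from the first '{' to the last '}'.
    Pattern pattern = Pattern.compile("\\{[^}]*\\}");
    Matcher matcher = pattern.matcher(str);
    while (matcher.find()) {
        String matched = matcher.group();
        // do whatever you want with matched
    }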
    
  • The following code is different from other answers in that it uses non-greedy matching (".*?"):

        // Requires: using System; using System.Linq; using System.Text.RegularExpressions;
        private static void Main(string[] args) {
            const string input = "Warning: the error {0} has occurred. {1, 9} module has {2:0%} done.";
            const string pattern = @"\{.*?\}"; // NOTE: "?" is required here (non-greedy matching).
            var formattingParts = Regex.Matches(input, pattern)
                .Cast<Match>()
                .Select(match => match.Value);
            foreach (var part in formattingParts) {
                Console.WriteLine(part);
            }
        }
    
    Terry_Brown : +1 I'd started writing something with named matches to help extract each part individually, but the above stopped me - really nice solution
    Colin Burnett : Except doing [^}] in between { and } prevents it from being greedy. Likewise on [0-9:,% .] that Jordan proposed. Ethan's is the only greedy one that will match "{0} ... {1}" from beginning to end.
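To tie the answers together, here is a minimal Java sketch that uses the negated character class from the first answer to find each item, then a second pattern with named groups (the idea Terry_Brown mentions) to split an item into its pieces. The class name, method names, and group names are hypothetical. The second pattern assumes items have the shape {index[,alignment][:format]}, and neither pattern handles escaped braces such as "{{".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative, not from the answers above.
class FormatItemExtractor {
    // \{ and \} match literal braces; [^}]+ cannot run past the first closing brace.
    private static final Pattern ITEM = Pattern.compile("\\{[^}]+\\}");

    // Assumed item shape: {index[,alignment][:format]}, allowing spaces around the comma.
    private static final Pattern PARTS = Pattern.compile(
            "\\{(?<index>\\d+)\\s*(?:,\\s*(?<alignment>-?\\d+))?\\s*(?::(?<format>[^}]*))?\\}");

    // Returns every "{...}" item in the order it appears in the input.
    static List<String> extract(String input) {
        List<String> items = new ArrayList<>();
        Matcher matcher = ITEM.matcher(input);
        while (matcher.find()) {
            items.add(matcher.group());
        }
        return items;
    }

    // Splits one item into its parts; a group that did not participate is reported as null.
    static String describe(String item) {
        Matcher matcher = PARTS.matcher(item);
        if (!matcher.matches()) {
            return "unrecognized: " + item;
        }
        return "index=" + matcher.group("index")
                + ", alignment=" + matcher.group("alignment")
                + ", format=" + matcher.group("format");
    }

    public static void main(String[] args) {
        String input = "Warning: the error {0} has occurred. {1, 9} module has {2:0%} done.";
        for (String item : extract(input)) {
            System.out.println(item + " -> " + describe(item));
        }
    }
}
```

For the string in the question, this should print one line per item, for example "{1, 9} -> index=1, alignment=9, format=null".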
