I have the following line of text:
Reference=*\G{7B35DDAC-FFE2-4435-8A15-CF5C70F23459}#1.0#0#..\..\..\bin\App Components\AcmeFormEngine.dll#ACME Form Engine
and wish to grab the following as two separate capture groups:
AcmeFormEngine.dll
ACME Form Engine
Can anyone help?
-
using System.Text.RegularExpressions;

Regex regex = new Regex(
    @"\\(?<filename>[\w\.]+)\#(?<comment>[\w ]+)$",
    RegexOptions.IgnoreCase | RegexOptions.Compiled);
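A minimal usage sketch for the pattern above, assuming the input is the sample line from the question. The names ReferenceParser and Parse are made up for illustration. The character classes are left as posted, so a file name containing a hyphen would still fail to match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReferenceParser
{
    // Pattern as posted: '\', the file name, '#', then the description up to the end.
    private static readonly Regex Pattern = new Regex(
        @"\\(?<filename>[\w\.]+)\#(?<comment>[\w ]+)$",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns (filename, comment), or null when the line does not match.
    public static Tuple<string, string> Parse(string line)
    {
        Match m = Pattern.Match(line);
        if (!m.Success) return null;
        // Named groups are read back by name through the Groups indexer.
        return Tuple.Create(m.Groups["filename"].Value, m.Groups["comment"].Value);
    }

    public static void Main()
    {
        string line = @"Reference=*\G{7B35DDAC-FFE2-4435-8A15-CF5C70F23459}#1.0#0#..\..\..\bin\App Components\AcmeFormEngine.dll#ACME Form Engine";
        Tuple<string, string> result = Parse(line);
        Console.WriteLine(result.Item1); // AcmeFormEngine.dll
        Console.WriteLine(result.Item2); // ACME Form Engine
    }
}
```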
Matthew Scharley: Does the hash really need escaping? What special meaning does it have?
Bartek Szabat: Hash is the beginning of a comment.
Gishu: # seems to stand for a comment :) I think.
Matthew Scharley: Silly .NET regexes. Fixed mine now.
Gishu: This is broken if you have - or _ in the filename.
-
Regex r = new Regex(@"\\(.+?)\#(.+?)$");
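Non-greedy quantifiers are great.
'$': Match the end of the string.
"\#(.+?)": Intended to match everything from the last '#' to the end of the string and return that in a capture.
"\\(.+?)": Same again, except with an escaped '\'.
Note that the engine scans from the left, so on the sample line the lazy groups can match starting at the first '\' (giving the GUID as the first group). Excluding the delimiters from the groups, e.g. [^\\#]+ and [^#]+, avoids that.
Gishu: This doesn't work. '\.' is a valid match.
Matthew Scharley: Should be fixed now. Silly # comments.
Joel Coehoorn: Upvote because it's the shortest expression and you explained how/why it works.
-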
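To illustrate how the lazy pattern from the answer above behaves on the question's sample line, here is a small sketch that compares it with a variant using negated character classes. The variant pattern and the names LazyVsNegated and Capture are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyVsNegated
{
    public const string Sample =
        @"Reference=*\G{7B35DDAC-FFE2-4435-8A15-CF5C70F23459}#1.0#0#..\..\..\bin\App Components\AcmeFormEngine.dll#ACME Form Engine";

    // Runs a two-group pattern against the input and returns both captures,
    // or an empty array when there is no match.
    public static string[] Capture(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success
            ? new[] { m.Groups[1].Value, m.Groups[2].Value }
            : new string[0];
    }

    public static void Main()
    {
        // Lazy groups: the match begins at the first '\' in the line.
        string[] lazy = Capture(@"\\(.+?)\#(.+?)$", Sample);
        Console.WriteLine(lazy[0]); // G{7B35DDAC-FFE2-4435-8A15-CF5C70F23459}

        // Negated classes: group 1 cannot cross '\' or '#', group 2 cannot cross '#'.
        string[] strict = Capture(@"\\([^\\#]+)#([^#]+)$", Sample);
        Console.WriteLine(strict[0]); // AcmeFormEngine.dll
        Console.WriteLine(strict[1]); // ACME Form Engine
    }
}
```

Because neither group in the second pattern can contain a delimiter, the only position where the whole pattern can succeed is the last '\' in the line.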
If you are sure of the string format, you can also solve this in a down-to-earth way, without regex: take everything after the last index of '\', and split that at '#'.
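The two steps described above could look roughly like this. It is only a sketch: it assumes every line ends in "\<file>#<description>", and the names ReferenceSplitter and SplitTail are hypothetical.

```csharp
using System;

public static class ReferenceSplitter
{
    // Takes everything after the last '\' and splits it at '#'.
    // If the line has no '\', LastIndexOf returns -1 and the whole line is split.
    public static string[] SplitTail(string line)
    {
        string tail = line.Substring(line.LastIndexOf('\\') + 1);
        return tail.Split('#');
    }

    public static void Main()
    {
        string line = @"Reference=*\G{7B35DDAC-FFE2-4435-8A15-CF5C70F23459}#1.0#0#..\..\..\bin\App Components\AcmeFormEngine.dll#ACME Form Engine";
        string[] parts = SplitTail(line);
        Console.WriteLine(parts[0]); // AcmeFormEngine.dll
        Console.WriteLine(parts[1]); // ACME Form Engine
    }
}
```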
Gishu: Agree. More readable than a regex in this specific scenario.
tvanfosson: And more efficient, since we only need to do character comparisons and avoid the overhead of the state machine.
-
I voted for tomalask's non-regex approach. However, if you had to do it with a regex, I think you need something like this:
\\([^\\/?"<>|]+?)\#([^\\/?"<>|]+?)[\r\n]*$
This allows characters such as - and _, which are valid in filenames. It consists of two identical groups (each excluding characters that are invalid in Win32 filenames): the first begins after a backslash, the two are delimited by a #, and the second runs to the end of the line (the $). This assumes the second group is also a valid Win32 filename. I saw some ugly boxes (presumably trailing line-break characters) in the matched second group; the [\r\n]* keeps them out.
e.g.
F5C70F23459}#1.0#0#..\..\..\bin\App Components\Acme_Form-Engine.dll#ACME Form Engine
group#1 => Acme_Form-Engine.dll
group#2 => ACME Form Engine
In short, this is arcane; avoid it if possible.
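For completeness, a sketch of how the pattern above might be used from C#, assuming the example line arrives with a trailing "\r\n". The names Win32NameMatcher and Capture are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Win32NameMatcher
{
    // In a verbatim string, the double quote inside the character class is written as "".
    private static readonly Regex Pattern = new Regex(
        @"\\([^\\/?""<>|]+?)\#([^\\/?""<>|]+?)[\r\n]*$");

    // Returns both captures, or an empty array when the line does not match.
    public static string[] Capture(string line)
    {
        Match m = Pattern.Match(line);
        return m.Success
            ? new[] { m.Groups[1].Value, m.Groups[2].Value }
            : new string[0];
    }

    public static void Main()
    {
        // The trailing "\r\n" is consumed by [\r\n]* rather than by the second group.
        string line = @"F5C70F23459}#1.0#0#..\..\..\bin\App Components\Acme_Form-Engine.dll#ACME Form Engine" + "\r\n";
        string[] groups = Capture(line);
        Console.WriteLine(groups[0]); // Acme_Form-Engine.dll
        Console.WriteLine(groups[1]); // ACME Form Engine
    }
}
```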