Using Regex To Pass Syntax-valid C++ Declaration/initialization
Solution 1:
No regex in the world will be powerful enough to parse C++ declarations, for the very simple reason that the grammar is severely context-sensitive (and, in all likelihood, is actually undecidable).
For example, using the IsPrime template defined here, you can write a declaration like
int a = foo<IsPrime<234799>>::typen<1>();
which is syntactically valid if and only if 234799 is prime.
Consider using a different approach to validate C++ (e.g. g++ -fsyntax-only).
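The compiler-based check suggested above could be scripted roughly as follows. This is only a sketch: the helper names syntax_check_command and is_valid_cpp are hypothetical, and it assumes g++ is on the PATH and that the snippet is a complete translation unit fed through standard input.

```python
import shutil
import subprocess


def syntax_check_command(compiler="g++"):
    """Build the command line: check only, treat stdin ("-") as C++."""
    return [compiler, "-fsyntax-only", "-x", "c++", "-"]


def is_valid_cpp(snippet, compiler="g++"):
    """Return True/False as reported by the compiler, or None if it is missing."""
    if shutil.which(compiler) is None:
        return None  # assumption: no compiler installed, so nothing to report
    result = subprocess.run(
        syntax_check_command(compiler),
        input=snippet,          # the source is passed on standard input
        capture_output=True,
        text=True,
    )
    # A zero exit status means the compiler accepted the snippet.
    return result.returncode == 0


if __name__ == "__main__":
    for code in ("int a = 1;", "int a = ;"):
        print(repr(code), is_valid_cpp(code))
```

Keep in mind that -fsyntax-only still performs semantic checks, so a declaration that refers to an undeclared name is rejected even though it is grammatically fine.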
Solution 2:
As nneonneo mentioned, regex is not suitable for the task, but if you want to match the sample strings you have, you can use this:
^(?:\s*[A-Za-z_][A-Za-z0-9]*\s*(?:=\s*(?:[A-Za-z0-9]+(?:[+\/*-][A-Za-z0-9]+)?|"[^"]*"|'[^']*'))?\s*,)*\s*[A-Za-z_][A-Za-z0-9]*\s*(?:=\s*(?:[A-Za-z0-9]+(?:[+\/*-][A-Za-z0-9]+)?|"[^"]*"|'[^']*'))?\s*;
A couple of things I changed from your regex:
- Changed [A-z] to [A-Za-z].
- Put the =\s* 'outside' because it was quite repetitive.
- Added square brackets to the bare 0-9. I believe it was meant to be a character class.
- Added letters to the character class [0-9].
- Changed all the [^] to [^"] and [^'] where appropriate. I'm not too sure what you were trying to do, but just in case.
- Added the basic integer operators, with digits (and letters for variables) following them: (?:[+/*-][A-Za-z0-9]+)?
- Changed the * in the first character class after = to + to prevent a , immediately after =.
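To see why [A-z] is worth replacing, here is a small sketch using Python's re module (the question's regex flavor is not stated, so this is an assumption; the ASCII behavior is the same in most engines). The range A-z also covers the six punctuation characters that sit between Z and a.

```python
import re

# Characters located strictly between 'Z' (90) and 'a' (97) in ASCII.
between = [chr(code) for code in range(ord("Z") + 1, ord("a"))]

loose = re.compile(r"[A-z]")      # the original character class
strict = re.compile(r"[A-Za-z]")  # the corrected character class

# Collect which of those in-between characters each class accepts.
accepted_by_loose = [ch for ch in between if loose.fullmatch(ch)]
accepted_by_strict = [ch for ch in between if strict.fullmatch(ch)]

print(accepted_by_loose)   # ['[', '\\', ']', '^', '_', '`']
print(accepted_by_strict)  # []
```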
EDIT:
^(?:\s*[A-Za-z_][A-Za-z0-9_]*\s*(?:=\s*(?:[A-Za-z0-9_]+(?:\s*[+\/*-]\s*[A-Za-z0-9_]+)*|[0-9]+(?:\.[0-9]+)?(?:\s*[+\/*-]\s*[0-9]+(?:\.[0-9]+)?)+|"[^"]*"|'[^']*'))?\s*,)*\s*[A-Za-z_][A-Za-z0-9_]*\s*(?:=\s*(?:[A-Za-z0-9_]+(?:\s*[+\/*-]\s*[A-Za-z0-9_]+)*|[0-9]+(?:\.[0-9]+)?(?:\s*[+\/*-]\s*[0-9]+(?:\.[0-9]+)?)+|"[^"]*"|'[^']*'))?\s*;$
This version allows more whitespace and permits underscores in variable names.
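A quick way to try the edited pattern is to rebuild it from named pieces and run it over a few inputs. The sketch below uses Python's re module, which is an assumption since the question's regex flavor is not stated; the names IDENT, OP, NUM, VALUE, DECL and matches are made up for illustration, and the sample strings are invented, not the ones from the question.

```python
import re

IDENT = r"[A-Za-z_][A-Za-z0-9_]*"   # variable name
OP = r"\s*[+\/*-]\s*"               # one operator with optional spaces
NUM = r"[0-9]+(?:\.[0-9]+)?"        # integer or decimal literal
VALUE = (
    r"(?:[A-Za-z0-9_]+(?:" + OP + r"[A-Za-z0-9_]+)*"  # names/integers with operators
    r"|" + NUM + r"(?:" + OP + NUM + r")+"            # decimals, at least one operator
    r"|\"[^\"]*\"|'[^']*')"                           # quoted literals
)
DECL = IDENT + r"\s*(?:=\s*" + VALUE + r")?\s*"       # name with optional initializer
PATTERN = re.compile(r"^(?:\s*" + DECL + r",)*\s*" + DECL + r";$")


def matches(text):
    """Return True if the whole line is accepted by the pattern."""
    return PATTERN.match(text) is not None


if __name__ == "__main__":
    for sample in ["a = 5, b;", "x = y + 1;", "a = ,b;", "pi = 3.14;"]:
        print(repr(sample), matches(sample))
```

One thing this exercise shows: a lone decimal such as pi = 3.14; is rejected by the pattern as written, because the numeric alternative requires at least one operator after the first number, while pi = 3.14 * 2; is accepted.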