schlitt.info - php, photography and private stuff

Regex guru needed

I recently wanted to extract all function/method prototypes from a PHP file using PCRE. The extraction itself is not a real problem, but for further processing I need every single parameter of each function as a separate element. This would be no problem with two regexes (a. extract the function prototype, b. extract the parameters), but that would (IMHO) be much more resource intensive than doing it all at once.
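For reference, the two-regex approach could look roughly like the sketch below. The input string $source and both patterns are simplified assumptions made up for illustration, not the real expressions; in particular, [^)]* breaks as soon as a default value contains a closing parenthesis.

```php
<?php
// Hypothetical input: a snippet of PHP source held in a string.
$source = 'function foo($a, $b = 1) {} function bar() {}';

// Step a: extract each prototype (function name and raw parameter list).
preg_match_all('/function\s+(\w+)\s*\(([^)]*)\)/', $source, $protos, PREG_SET_ORDER);

$result = [];
foreach ($protos as $proto) {
    // Step b: extract every parameter name from the raw parameter list.
    preg_match_all('/\$\w+/', $proto[2], $params);
    $result[$proto[1]] = $params[0];
}

var_dump($result);
```

Each function name ends up as a key in $result, with its parameter names as a list. The cost is one extra preg_match_all() call per function found.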

Here is a (much smaller than the real one) example which describes the problem (see the extended entry for the real regex I developed):

$string = "(aaa bbb ccc ddd)";
$regex = "/\((?: ([a-z]+)\s* )+\)/x";
preg_match($regex, $string, $matches);

The output of a var_dump($matches) looks like this:

array(2) {
  [0]=>
  string(17) "(aaa bbb ccc ddd)"
  [1]=>
  string(3) "ddd"
}

Where I would expect something like

array(5) {
  [0]=>
  string(17) "(aaa bbb ccc ddd)"
  [1]=>
  string(3) "aaa"
  [2]=>
  string(3) "bbb"
  [3]=>
  string(3) "ccc"
  [4]=>
  string(3) "ddd"
}

or

array(2) {
  [0]=>
  string(17) "(aaa bbb ccc ddd)"
  [1]=>
  array(4) {
    [0]=>
    string(3) "aaa"
    [1]=>
    string(3) "bbb"
    [2]=>
    string(3) "ccc"
    [3]=>
    string(3) "ddd"
  }
}

Does anyone know a solution for that (I repeat: in one regular expression)?
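One possible direction, offered as a sketch rather than a definitive answer: PCRE keeps only the last iteration of a repeated capturing group, but preg_match_all() combined with the \G anchor can walk through the words one at a time within a single pattern. \G matches only where the previous match ended, and (?!^) stops it from also matching at the very start of the subject. The pattern below is an assumption tailored to the small example; note that it does not check for the closing parenthesis.

```php
<?php
$string = "(aaa bbb ccc ddd)";

// Either start a new list at "(" or continue right after the previous match.
$regex = '/(?:\(|\G(?!^))\s*([a-z]+)/';

// Default flag PREG_PATTERN_ORDER: $matches[1] collects every capture of group 1.
$count = preg_match_all($regex, $string, $matches);

var_dump($matches[1]);
```

Here $matches[1] holds "aaa", "bbb", "ccc" and "ddd" as separate elements, and $count is the number of words found.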

The original code for extracting functions/methods from PHP files I developed:

$regex = '/(?:function \s*([a-zA-Z0-9_]+)\s* \(\s*
    (?:
        ( \$[A-Za-z0-9_]+
            (?: \s*=\s* (?: \'[^\']*\' | "[^"]*" | [A-Za-z0-9-.]+ ) )
        )
        \s*,*\s*
    )*
\) )/x';
$res = preg_match_all($regex, $in, $matches, PREG_SET_ORDER);

If you have any optimizations, please comment on this entry! Thanks!
