Regex: search group returning only last character


#1

I’m doing a search and replace task using the regular expression
^(1\.\d{2}) - (\w|[ ,();áéíóúÁÉÍÓÚçÇêôÊÔãõÃÕñÑ])+\.$
on a text like this:

1 - Serviços de informática e congêneres.

1.01 - Análise e desenvolvimento de sistemas.

1.02 - Programação.

1.03 - Processamento de dados e congêneres.

1.04 - Elaboração de programas de computadores, inclusive de jogos eletrônicos.

But the backreference $2 returns only the last character matched by the second group. Where am I going wrong?


#2

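I assume the problem is that your + lies outside your (). A quantifier placed after a capturing group repeats the group, and the group keeps only the text of its last repetition, which here is a single character.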
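To illustrate, here is a minimal sketch in Python's re module (an assumption, since the original post does not name the tool; the $2 syntax suggests an editor or another engine, but most engines behave the same way here). The sample line uses a plain ASCII hyphen so that it matches the " - " in the pattern.

```python
import re

# Sample line from the question, written with an ASCII hyphen (assumed)
# so that it matches the literal " - " in the pattern.
line = "1.02 - Programação."

# Original pattern: the + quantifier sits OUTSIDE the capturing group,
# so the group itself matches one character at a time.
repeated_group = re.compile(
    r"^(1\.\d{2}) - (\w|[ ,();áéíóúÁÉÍÓÚçÇêôÊÔãõÃÕñÑ])+\.$"
)

m = repeated_group.match(line)
code = m.group(1)   # the item number
text = m.group(2)   # only the last repetition of the group survives

print(code)  # 1.02
print(text)  # o
```

The whole line still matches; it is only the content of group 2 that gets overwritten on every repetition.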


#3

I would try ^(1\.\d{2}) - ([\w ,();áéíóúÁÉÍÓÚçÇêôÊÔãõÃÕñÑ]+)\.$
I moved the \w inside the character class (faster than an alternation) and the + inside the capturing group.
Or, perhaps simpler, if the rest of the pattern is discriminating enough to avoid false positives:
^(1\.\d{2}) - (.+)\.$
This version doesn’t care about the text itself, only about what surrounds it. Your expression will fail if the sentence contains a character that is not in the class, a dash for example…
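A quick way to compare the two suggestions, again as a sketch in Python's re module (my assumption, not necessarily the tool used in the question). The line in `dashed` is a made-up example, not part of the original list, and the sample lines use ASCII hyphens as separators.

```python
import re

# Suggestion 1: the + is inside the capturing group, \w is inside the class.
fixed = re.compile(r"^(1\.\d{2}) - ([\w ,();áéíóúÁÉÍÓÚçÇêôÊÔãõÃÕñÑ]+)\.$")

# Suggestion 2: capture anything between " - " and the final period.
loose = re.compile(r"^(1\.\d{2}) - (.+)\.$")

line = "1.04 - Elaboração de programas de computadores, inclusive de jogos eletrônicos."
fixed_text = fixed.match(line).group(2)
loose_text = loose.match(line).group(2)
print(fixed_text)

# Hypothetical line (not from the question) with a dash inside the sentence.
dashed = "1.99 - Exemplo pré-instalado."
fixed_on_dashed = fixed.match(dashed)          # the class has no "-", so no match
loose_on_dashed = loose.match(dashed).group(2)
print(fixed_on_dashed)   # None
print(loose_on_dashed)   # Exemplo pré-instalado
```

Both patterns capture the full description on the ordinary line; only the looser one survives the dash.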