
Find and Capture RegEx Matches In Python

Working on a new thing: swapping two words with re.sub() and backreferences to capture groups.

Code

import re

text = "the quick brown fox"

# Capture the "q" word and the "b" word, then swap them with \2 \1
text = re.sub(r'(q\S+) (b\S+)', r'\2 \1', text)

print(text)

Results

the brown quick fox

This is the basic way to find and capture regular expression matches in Python.

Code

import re

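string = 'Test Date: 2017-04-05'
matches = re.search(r"\d\d\d\d-(\d\d)-\d\d", string)

if matches:
    # group() with no argument returns the entire match
    print("Entire group: ", end="")
    print(matches.group())
    # group(1) returns the first parenthesized subgroup (the month)
    print("Subgroup by index: ", end="")
    print(matches.group(1))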

Results

Entire group: 2017-04-05
Subgroup by index: 04

This is how to do it with a compiled regular expression:

Code

import re

string = 'Test Date: 2017-04-05'

# Use a raw string so \d is not treated as a string escape
pattern = re.compile(r'\d\d\d\d-(\d\d)-\d\d')

matches = pattern.search(string)

if matches:
    print("Entire match: ", end="")
    print(matches.group())
    print("Subgroup by index: ", end="")
    print(matches.group(1))

Results

Entire match: 2017-04-05
Subgroup by index: 04
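
If the text holds more than one date, or the numbered groups get hard to keep track of, named groups and finditer() are one option. This is only a rough sketch: the sample string and the group names year, month, and day are made up for illustration.

```python
import re

# Hypothetical sample text with two dates (not from the notes above)
text = "Start: 2017-04-05, End: 2018-11-30"

# (?P<name>...) labels each capture group so it can be read by name
pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

# finditer() yields one match object per non-overlapping match
months = [m.group("month") for m in pattern.finditer(text)]
print(months)  # ['04', '11']

# groupdict() returns every named group from a single match
first = pattern.search(text)
print(first.groupdict())  # {'year': '2017', 'month': '04', 'day': '05'}
```

Named groups still work by index too, so matches.group(2) and matches.group("month") return the same thing here.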

Old Notes:

Code

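import csv
import re

# newline='' lets the csv module handle line endings itself
with open('/tmp/output-2.csv', 'r', newline='') as input2:
    with open('/tmp/output-3.csv', 'w', newline='') as output3:
        csv_reader = csv.reader(input2)
        csv_writer = csv.writer(output3, quoting=csv.QUOTE_ALL)
        # Three slash-separated numbers; raw string so \d is not a string escape
        pattern = re.compile(r"(\d+)/(\d+)/(\d+)")

        def replacement(m):
            # Reorder groups 1/2/3 into 20<3>-<1>-<2>, zero-padding each part
            return f"20{m.group(3).zfill(2)}-{m.group(1).zfill(2)}-{m.group(2).zfill(2)}"

        for row in csv_reader:
            # Rewrite the slash-separated values in columns 7 and 8
            row[7] = pattern.sub(replacement, row[7])
            row[8] = pattern.sub(replacement, row[8])
            # Strip '-' and '%' characters from columns 12 and 15
            row[12] = row[12].replace('-', '')
            row[12] = row[12].replace('%', '')
            row[15] = row[15].replace('%', '')
            csv_writer.writerow(row)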
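
To check the date rewrite without touching the CSV files, the same substitution can be tried on a plain string. This is a sketch that assumes the source dates are month/day/two-digit-year with years in the 2000s, which is what the replacement function implies. The sample value is made up.

```python
import re

pattern = re.compile(r"(\d+)/(\d+)/(\d+)")

def replacement(m):
    # Assumption: input is M/D/YY and the year belongs to the 2000s
    month, day, year = m.groups()
    return f"20{year.zfill(2)}-{month.zfill(2)}-{day.zfill(2)}"

sample = "4/5/17"  # hypothetical value, not taken from the real CSV
converted = pattern.sub(replacement, sample)
print(converted)  # 2017-04-05
```

A four-digit year in the input would come out wrong with this function (it would get a second "20" prefix), so it only fits data that really uses two-digit years.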