- Question 1
When we left the lab last Monday, we used backreferencing to
match a pair of HTML tags (see
http://www.regular-expressions.info/named.html).
Write a more complex expression, using two backreferences, to
match two sets of HTML tags, one embedded
in the other.
It should produce a match on the string
>>> s = r'<html><title> The spring 2011 foundations class</title></html>'
as well as
>>> s = r'<body><h3> The spring 2011 foundations class</H3></BODY>'
Don't forget to turn off case sensitivity.
Recall that for a single set of tags we used
>>> match = re.search(r'<([A-Z][A-Z0-9]*)[^>]*>(.*?)</\1>', s, re.IGNORECASE)
and typed both
>>> print(match.group(0))
and
>>> print(match.group(1))
to see the results.
Write your solution as a function definition in Python, in your file called relab.py.
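As a starting point, the single-tag example recalled above can be wrapped in a function like the one below. This is only a sketch of the one-backreference case, not the two-backreference answer; the function name show_single_tag_match and the sample strings are assumptions made for illustration.

```python
import re

# The single-tag pattern from the lab: group 1 captures the tag name,
# group 2 the text between the tags, and \1 must repeat the tag name.
TAG_PATTERN = r'<([A-Z][A-Z0-9]*)[^>]*>(.*?)</\1>'


def show_single_tag_match(text):
    """Return (whole match, tag name, enclosed text), or None if no match."""
    match = re.search(TAG_PATTERN, text, re.IGNORECASE)
    if match is None:
        return None
    return match.group(0), match.group(1), match.group(2)


# Hypothetical sample strings, not part of the assignment.
print(show_single_tag_match('<title> The spring 2011 foundations class</title>'))
print(show_single_tag_match('<h3> mixed case</H3>'))
```

With re.IGNORECASE the backreference is also compared without regard to case, which is why an opening h3 can pair with a closing H3.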
- Question 2
Type all of your answers to this part into the
same file, relab.py. Start each line of
text with python's comment symbol, #.
For example:
# \b is a word boundary
# \d{1,3} indicates between 1 and 3 digits
# etc.
In the IP address example on the same examples
web site, http://www.regular-expressions.info/examples.html,
explain exactly how the three regular expressions work:
1. \b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b
2. \b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
3. \b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
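Before writing your explanation, it may help to try the three expressions on a few strings and compare what each one returns. The sketch below does that; the helper name try_patterns and the two sample strings are assumptions chosen for illustration, and the output is something for you to explain, not an explanation in itself.

```python
import re

# The three expressions from the examples page, keyed by their number above.
IP_PATTERNS = {
    1: r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b',
    2: r'\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.'
       r'(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b',
    3: r'\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}'
       r'(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b',
}


def try_patterns(text):
    """Map each pattern number to (matched text, captured groups) or None."""
    results = {}
    for number, pattern in IP_PATTERNS.items():
        match = re.search(pattern, text)
        if match is None:
            results[number] = None
        else:
            results[number] = (match.group(0), match.groups())
    return results


# Hypothetical test strings: one plausible address, one with out-of-range numbers.
for sample in ('192.168.1.1', '999.999.999.999'):
    print(sample, try_patterns(sample))
```

Comparing match.groups() across the three results is one way to see what the (?: ... ) syntax in the third expression changes.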
- Question 3
Read http://www.regular-expressions.info/completelines.html,
information on using regular expressions
to find lines of text.
We've already started looking at this, with our brief
description of negative and positive lookahead in the lab
(PatternLab.html).
Write a regular expression that matches a complete line of text that contains
all of the words
"melody", "similarity", and "computer", in any order.
Use the regular expression and examples within
a function definition in your file relab.py.
Describe how your regular expression works,
in detail. Again,
# use Python's comments to answer the text
# part of this homework.
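The lookahead technique from the completelines page can be sketched with a smaller case. The example below uses two stand-in words, "red" and "blue", rather than the three words required above; the function name lines_with_both and the sample text are assumptions made for illustration.

```python
import re

# One positive lookahead per required word, each anchored at the start of
# the line, followed by .* to consume the whole line once both succeed.
BOTH_WORDS = r'^(?=.*\bred\b)(?=.*\bblue\b).*$'


def lines_with_both(text):
    """Return every complete line of text that contains both stand-in words."""
    # re.MULTILINE lets ^ and $ match at the start and end of each line.
    return re.findall(BOTH_WORDS, text, re.MULTILINE)


# Hypothetical sample text, not part of the assignment.
sample = 'blue then red\nonly red here\nred and blue'
print(lines_with_both(sample))
```

Because each lookahead starts from the beginning of the line and consumes nothing, the order in which the words appear on the line does not matter.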
Don't forget to submit by class time Friday Feb 18:
chmod -R g+r ./
submit relab relab.py