Σ = {C, A, G, T}, L = { w : w starts and ends with C^j, j % 2 = 1; #T(w) = #A(w) - #G(w), #A(w) > #G(w);
#T(w) % 2 = 1 }. For example: C^3GA^2TC^3 ∈ L; C^3A^5T^3G^2C^3 ∈ L; C^3GA^2TC^4 ∉ L because it does not start
and end with the same number of C's; w = C^3A^5T^7G^2C^3 ∉ L because #T(w) = 7 ≠ (#A(w) - #G(w)) = 3;
C^2GA^2TC^2 ∉ L because the length of the starting and ending C regions is not an odd number; and w =
C^3GTA^3TC^3 ∉ L because #T(w) % 2 ≠ 1.
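The definition above can be checked mechanically. The following is a minimal sketch of a membership test, not part of the assignment. The function name in_L is hypothetical, and the sketch assumes that "starts and ends with C^j" refers to the maximal runs of C's at the two ends of w, and that C's are not forbidden in the middle of w (the definition does not say either way).

```python
def in_L(w: str) -> bool:
    """Membership test for L under the assumptions stated in the lead-in."""
    # Reject any symbol outside the alphabet {C, A, G, T}.
    if any(ch not in "CAGT" for ch in w):
        return False
    lead = len(w) - len(w.lstrip("C"))   # length of the maximal leading C run
    trail = len(w) - len(w.rstrip("C"))  # length of the maximal trailing C run
    if lead == len(w):                   # w is empty or consists only of C's
        return False
    # Both C regions must have the same, odd length j.
    if lead != trail or lead % 2 != 1:
        return False
    a, g, t = w.count("A"), w.count("G"), w.count("T")
    # #A(w) > #G(w), #T(w) = #A(w) - #G(w), and #T(w) is odd.
    return a > g and t == a - g and t % 2 == 1
```

Applied to the six example strings, with each exponent written out as repeated symbols, this test should agree with the memberships stated above.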
Prove that L ∉ RLs using the regular language pumping lemma. Start by defining a string S such
that S ∈ L and |S| ≥ N, N ≥ 1. Please do not use any of the example strings above in your proof. Proofs
based on defining 3 substrings of S, in the style found in the 520 archives, will not be graded (for
example, dividing S into x, y, and z). Note that for S to be at least N symbols long, it must use N, and/or
an expression derived from N, as a repetition operator. For example, C^5GTA^2C^5 could not work as an
S, even though it is in L, because it is not defined in terms of N; thus it is meaningless to ask whether it is
at least as long as N.
Other points to bear in mind:
• The only valid assumption about N is that it is at least 1.
• After defining S, the next step is to show that for all non-null substrings R within the first N
symbols of S, there is some way to pump (remove or replicate) R to generate S' ∉ L.
• Your conclusion must state why L ∉ RLs, not why some S' ∉ L.
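The "for all R" requirement in the points above can be sanity-checked by brute force for small N. The sketch below is only an illustration and is not a proof: the pumping lemma argument must hold for every N ≥ 1, while the code tests a handful of values. The names make_S and every_R_can_be_pumped_out are hypothetical, and S = C^(2N+1) A T C^(2N+1) is just one assumed candidate that is defined in terms of N, not necessarily the intended answer. The compact in_L uses the same maximal-C-run reading of the definition.

```python
def in_L(w: str) -> bool:
    """Compact membership test for L (maximal C runs at both ends assumed)."""
    lead = len(w) - len(w.lstrip("C"))
    trail = len(w) - len(w.rstrip("C"))
    a, g, t = w.count("A"), w.count("G"), w.count("T")
    return (lead == trail and lead % 2 == 1
            and a > g and t == a - g and t % 2 == 1)


def make_S(n: int) -> str:
    """Assumed candidate S = C^(2n+1) A T C^(2n+1); its length is 4n + 4 >= n."""
    m = 2 * n + 1
    return "C" * m + "AT" + "C" * m


def every_R_can_be_pumped_out(n: int, max_copies: int = 3) -> bool:
    """Check that each non-null R within the first n symbols of S has some
    pumped variant (R removed, or R repeated up to max_copies times) outside L."""
    s = make_S(n)
    assert in_L(s) and len(s) >= n
    for start in range(n):
        for end in range(start + 1, n + 1):
            r = s[start:end]
            # k = 0 removes R; k >= 2 replicates it; k = 1 is S itself.
            pumped = [s[:start] + r * k + s[end:]
                      for k in range(max_copies + 1) if k != 1]
            if all(in_L(p) for p in pumped):
                return False  # this R cannot be pumped out of L
    return True
```

For this candidate, every R within the first N symbols consists only of C's, so pumping changes the leading C region while the trailing region stays fixed. A written proof would state that argument for arbitrary N and then conclude that L violates the pumping lemma and is therefore not regular.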