
Regular Expression

This is a summary of commonly used regular expression syntax.


Basic Regex
1. A literal matches exactly the characters you type. Matching is case sensitive.
For example, /s/ matches every "s" in "cats" and "sands" but does not match the "S" in "Superman".

2.  $ ^ * + ? . ( ) [ ] { } | \ 
The characters above are special characters. To match one of them literally, escape it with a backslash. Escaping any of these special characters is always safe and never causes an error.

A space (" ") is not a special character and is matched without a backslash.

3.  /./
A period matches any single character. To match only a literal ".", escape it with a backslash: /\./

4. A regex matches its patterns in the order you type them. /cat / matches "cat" followed by exactly one space (" "). This is known as concatenation. The order of the patterns matters.

5. To match one pattern or another, use /(cat|dog|rabbit)/. It matches either "cat", "dog" or "rabbit". This is called alternation.

6. /launch/i
For a case-insensitive regex, add an "i" after the closing forward slash.
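
The six basics above can be tried out in Ruby. This is a minimal sketch: count_matches and first_match are hypothetical helper names introduced only for illustration, and the sample strings are made up.

```ruby
# Hypothetical helpers for illustration (not part of Ruby's standard library).
def count_matches(text, pattern)
  text.scan(pattern).length   # number of non-overlapping matches
end

def first_match(text, pattern)
  text[pattern]               # the first matched substring, or nil
end

# 1. Literals are case sensitive: the capital "S" is not counted.
p count_matches("cats sands Superman", /s/)                # => 3

# 2. A special character must be escaped to be matched literally.
p first_match("total: $5", /\$5/)                          # => "$5"

# 3. An unescaped period matches any character; an escaped one only ".".
p count_matches("a.b", /./)                                # => 3
p count_matches("a.b", /\./)                               # => 1

# 4. Concatenation: "cat" must be followed by a space.
p "cat food".match?(/cat /)                                # => true
p "catfood".match?(/cat /)                                 # => false

# 5. Alternation: the first of the alternatives found in the text.
p first_match("I walk my dog daily", /(cat|dog|rabbit)/)   # => "dog"

# 6. The i flag makes the match case insensitive.
p "LAUNCH".match?(/launch/)                                # => false
p "LAUNCH".match?(/launch/i)                               # => true
```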

Character Class
A character class matches a single occurrence of any one of the characters listed inside [ ]. It is similar to alternation, but between single characters.

1. Only ^ \ - [ ] are special characters inside a character class [ ]. They need to be escaped with a backslash ("\").

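2. Range of characters. [a-z] matches any single lowercase letter from a to z. Use [A-Z] for capital letters. Don't use [a-Z] or [A-z]: [a-Z] is not a valid range, and [A-z] also matches the punctuation characters that sit between "Z" and "a" in ASCII. Use [a-zA-Z0-9] to match any alphanumeric character.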

3. Range of characters. Don't build ranges such as [*-^] with non-alphanumeric endpoints.

4. Negation. Use ^ as the first character inside [ ] to mean "other than". E.g. [^a-z] matches any character other than a to z.
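
The character class rules above can be checked with a short Ruby sketch. chars_matching is a hypothetical helper name and the sample strings are made up for illustration.

```ruby
# Hypothetical helper: every single character that the class matches.
def chars_matching(text, char_class)
  text.scan(char_class)
end

# Range: lowercase letters only.
p chars_matching("Route 66!", /[a-z]/)                  # => ["o", "u", "t", "e"]

# Several ranges combined: all alphanumeric characters.
p chars_matching("Route 66!", /[a-zA-Z0-9]/).join       # => "Route66"

# Negation: everything other than a to z.
p chars_matching("Route 66!", /[^a-z]/)                 # => ["R", " ", "6", "6", "!"]

# Special characters inside a class are escaped with a backslash.
p chars_matching("a-b^c", /[\^\-]/)                     # => ["-", "^"]

# Why [A-z] is a trap: it also matches "_", which sits between "Z" and "a".
p chars_matching("snake_case", /[A-z]/).include?("_")   # => true
```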


Character Class - short-cut
These are shortcuts for commonly used character classes.

1. A period (.) represents any character except a newline. Inside [ ], as in [.], it represents a literal period.

2. \s = whitespace character; \S = non-whitespace character.
Whitespace is space (" "), tab ("\t"), vertical tab ("\v"), carriage return ("\r"), line feed ("\n") and form feed ("\f").

3. Additional shortcuts
Shortcut    Meaning
\d          Any decimal digit (0-9)
\D          Any character but a decimal digit
\h          Any hexadecimal digit (0-9, A-F, a-f) (Ruby only)
\H          Any character but a hexadecimal digit (Ruby only)
4. \w matches a word character; \W matches a non-word character.
A word character is any letter, any digit or an underscore (_).
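
A quick Ruby sketch of the shortcuts above. shortcut_matches is a hypothetical helper name and the sample strings are made up for illustration.

```ruby
# Hypothetical helper: every substring that the shortcut pattern matches.
def shortcut_matches(text, pattern)
  text.scan(pattern)
end

# A period matches any character except a newline.
p shortcut_matches("a\nb", /./)               # => ["a", "b"]

# Inside [ ] the period is a literal period.
p shortcut_matches("a.b", /[.]/)              # => ["."]

# \s matches whitespace: here a space, a tab and a line feed.
p shortcut_matches("a1 b2\tc3\n", /\s/)       # => [" ", "\t", "\n"]

# \d matches decimal digits.
p shortcut_matches("a1 b2\tc3\n", /\d/)       # => ["1", "2", "3"]

# \h matches hexadecimal digits (Ruby only); "x" is not one.
p shortcut_matches("0xFF", /\h/)              # => ["0", "F", "F"]

# \w matches letters, digits and underscore; \W matches everything else.
p shortcut_matches("user_name-42", /\w+/)     # => ["user_name", "42"]
p shortcut_matches("user_name-42", /\W/)      # => ["-"]
```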

Anchors
Anchors don't match any characters. They ensure that the regex matches only at a certain position: the beginning or end of the string, the beginning or end of a line, or a word/non-word boundary.


1.  ^ means the beginning of a line. $ means the end of a line.

2. \A means the start of the string. \z means the end of the string.

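3. \b means word boundary ( |word| ), where each pipe marks a boundary. A word boundary is the position between a word character and a non-word character (or the start or end of the string), so it marks the edges of a word.

    \B means non-word boundary ( w|o|r|d ), where each pipe marks a non-word boundary. It matches any position that is not a word boundary, such as between two letters inside a word.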
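
The difference between line anchors, string anchors and word boundaries shows up clearly on a two-line string. This is a minimal Ruby sketch: anchored? and count_matches are hypothetical helper names and the sample strings are made up.

```ruby
# Hypothetical helpers for illustration.
def anchored?(text, pattern)
  text.match?(pattern)
end

def count_matches(text, pattern)
  text.scan(pattern).length
end

two_lines = "dog\ncat"   # "dog" on the first line, "cat" on the second

# ^ and $ work per line; \A and \z work on the whole string.
p anchored?(two_lines, /^cat/)    # => true  ("cat" starts the second line)
p anchored?(two_lines, /\Acat/)   # => false (the string starts with "dog")
p anchored?(two_lines, /dog$/)    # => true  ("dog" ends the first line)
p anchored?(two_lines, /dog\z/)   # => false (the string ends with "cat")

words = "cat concat catalog"

# Without anchors, "cat" is found inside all three words.
p count_matches(words, /cat/)         # => 3

# \b on both sides: only the standalone word "cat".
p count_matches(words, /\bcat\b/)     # => 1

# \B before and \b after: only the "cat" that ends "concat".
p count_matches(words, /\Bcat\b/)     # => 1
```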

Quantifiers
Quantifiers are useful for matching repeated patterns.

Zero or more
*    matches 0 or more occurrences of the pattern immediately to its left.
It really does mean zero or more: /x*/ also matches the empty string, because zero occurrences of x counts as a match.

You can try (123)* to find 0 or more occurrences of sequence 123.
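
A short Ruby sketch of the zero-or-more behavior, including the empty match. first_match is a hypothetical helper name and the sample strings are made up.

```ruby
# Hypothetical helper: the first matched substring, or nil if nothing matches.
def first_match(text, pattern)
  text[pattern]
end

# With x present, * takes as many as it can.
p first_match("xxx", /x*/)               # => "xxx"

# With no x at all, /x*/ still matches: it matches the empty string.
p first_match("abc", /x*/)               # => ""

# Parentheses make * apply to the whole sequence 123.
p first_match("123123abc", /(123)*/)     # => "123123"

# Zero, one or two "a" characters between "c" and "t".
p "ct cat caat".scan(/ca*t/)             # => ["ct", "cat", "caat"]
```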

One or more
+    matches 1 or more occurrences of the pattern immediately to its left.


Zero or one
?    matches 0 or 1 occurrence of the pattern immediately to its left.
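
Running + and ? against the same made-up sample shows how they differ. quantified is a hypothetical helper name used only for illustration.

```ruby
# Hypothetical helper: every substring that the pattern matches.
def quantified(text, pattern)
  text.scan(pattern)
end

sample = "ct cat caat"

# + needs at least one "a", so "ct" is left out.
p quantified(sample, /ca+t/)                 # => ["cat", "caat"]

# ? allows at most one "a", so "caat" is left out.
p quantified(sample, /ca?t/)                 # => ["ct", "cat"]

# ? is handy for optional letters.
p quantified("color colour", /colou?r/)      # => ["color", "colour"]
```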

A specific number of occurrences
{m} matches exactly m occurrences of the pattern to its left.

{m,n} matches from m up to n occurrences of the pattern to its left.

{m,} matches m or more occurrences of the pattern to its left.
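
The three brace forms can be compared on one string of numbers. This is a minimal Ruby sketch: digit_runs is a hypothetical helper name, the sample string is made up, and the \b anchors are added so that each number is matched as a whole.

```ruby
# Hypothetical helper: whole numbers in the text whose digit count fits the pattern.
def digit_runs(text, pattern)
  text.scan(pattern)
end

numbers = "1 12 123 1234"

# {3}: exactly three digits.
p digit_runs(numbers, /\b\d{3}\b/)      # => ["123"]

# {2,3}: two or three digits.
p digit_runs(numbers, /\b\d{2,3}\b/)    # => ["12", "123"]

# {3,}: three or more digits.
p digit_runs(numbers, /\b\d{3,}\b/)     # => ["123", "1234"]
```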


















