
Regular Expression

These are the most commonly used pieces of regular expression syntax.


Basic Regex
1. A literal character matches itself. Matching is case sensitive.
For example, /s/ matches every 's' in "cats" and "sands" but does not match the 'S' in "Superman".

2.  $ ^ * + ? . ( ) [ ] { } | \ 
The characters above are special. To match one of them literally, escape it with a backslash. It is always safe to escape any of these special characters, even where the escape is not strictly needed.

Spaces (" ") are not special characters and are matched without a backslash.

3.  /./
A period matches any character. To match only a literal ".", escape it with a backslash: /\./

4. A regex matches its patterns in the order you type them. /cat / matches "cat" followed by exactly one space " ". This is known as concatenation, and the order of the patterns matters.

5. To match one option or another, use /(cat|dog|rabbit)/, which matches either cat, dog or rabbit. This is called alternation.

6. /launch/i
For a case-insensitive regex, put an 'i' after the closing forward slash.
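The six points above can be sketched in Ruby as follows. The sample strings and variable names are made up for illustration, and Regexp.escape is a standard Ruby helper that these notes do not mention, shown here only as one way to add the backslashes.

```ruby
# Illustrative strings and variable names; none come from the notes above.

# 1. Literals are case sensitive: only the three lowercase "s" characters match.
literal_hits = "cats sands Superman".scan(/s/)

# 2. Regexp.escape adds the backslashes for special characters.
escaped = Regexp.escape("1+1=2?")           # "1\\+1=2\\?"

# 3. A bare period matches any character; an escaped period matches only ".".
any_char = "a.b".scan(/./)                  # ["a", ".", "b"]
only_dot = "a.b".scan(/\./)                 # ["."]

# 4. Concatenation: the trailing space is part of the pattern.
with_space    = "cat nap".match?(/cat /)    # true
without_space = "catnap".match?(/cat /)     # false

# 5. Alternation: the group captures whichever option matched.
pet = "I own a dog"[/(cat|dog|rabbit)/, 1]  # "dog"

# 6. The i flag ignores case.
insensitive = "LAUNCH School".match?(/launch/i)   # true
sensitive   = "LAUNCH School".match?(/launch/)    # false
```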

Character Class
A character class matches a single occurrence of any one of the characters listed inside [ ]. It works much like alternation between single characters.

1. Only ^ \ - [ ] are special characters inside a character class. They need to be escaped with a backslash ("\").

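2. Range of characters. [a-z] matches any single lowercase letter from a to z. Use [A-Z] for capitals. Don't use [a-Z] or [A-z]: in ASCII the capitals come before the lowercase letters, so [a-Z] is a reversed range and [A-z] also matches the punctuation characters that sit between Z and a. Use [a-zA-Z0-9] to match any alphanumeric character.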

3. Range of characters. Don't build ranges such as [*-^] from non-alphanumeric characters, because it is hard to tell which characters fall inside them.

4. Negation. Use ^ as the first character inside the brackets to mean "other than". E.g. [^a-z] matches any character other than a to z.
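A short Ruby sketch of the character class rules above, using made-up sample strings. The reversed range is built with Regexp.new so that the failure can be rescued at run time; the variable names are only illustrative.

```ruby
# Illustrative strings and variable names only.

# Ranges and negation.
lower   = "Ruby 3".scan(/[a-z]/)          # ["u", "b", "y"]
alnum   = "Ruby 3!".scan(/[a-zA-Z0-9]/)   # ["R", "u", "b", "y", "3"]
negated = "Ruby 3".scan(/[^a-z]/)         # ["R", " ", "3"]

# Special characters inside a class are escaped with a backslash.
symbols = "a-b^c".scan(/[\^\-]/)          # ["-", "^"]

# [A-z] also matches the ASCII punctuation between "Z" and "a".
leaky = "[_]".scan(/[A-z]/)               # ["[", "_", "]"]

# [a-Z] is a reversed range, so Ruby refuses to build the regex.
reversed_range_rejected =
  begin
    Regexp.new("[a-Z]")
    false
  rescue RegexpError
    true
  end
```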


Character Class - short-cut
These are shortcuts for commonly used character classes.

1. A period (.) represents any character except newline. Inside a character class, as in [.], it represents a literal period.

2. \s = whitespace character ; \S = non-whitespace character.
Whitespace is space (" "), tab ("\t"), vertical tab ("\v"), carriage return ("\r"), line feed ("\n") and form feed ("\f").

3. Additional shortcuts
Shortcut    Meaning
\d          Any decimal digit (0-9)
\D          Any character but a decimal digit
\h          Any hexadecimal digit (0-9, A-F, a-f) (Ruby only)
\H          Any character but a hexadecimal digit (Ruby only)
4. \w matches a word character ; \W matches a non-word character.
A word character is any letter (a-z, A-Z), any digit (0-9) or the underscore (_).
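The shortcuts above could be tried out in Ruby like this. The sample text and variable names are invented for the example.

```ruby
# Illustrative strings and variable names only.
text = "Room 42\tis_open\n"

digits     = text.scan(/\d/)           # ["4", "2"]
whitespace = text.scan(/\s/)           # [" ", "\t", "\n"]
non_word   = text.scan(/\W/)           # [" ", "\t", "\n"]
word_chars = text.scan(/\w/).join      # "Room42is_open"

# \h is the Ruby-only hexadecimal shortcut.
hex = "0xBEEF zz".scan(/\h/)           # ["0", "B", "E", "E", "F"]

# A period skips newlines; inside [] it is a literal period.
no_newline  = "a\nb".scan(/./)         # ["a", "b"]
literal_dot = "a.b".scan(/[.]/)        # ["."]
```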

Anchors
Anchors don't match any characters. They ensure that the regex only matches at a certain place in the string: the beginning or end of the string, the beginning or end of a line, or a word/non-word boundary.


1.  ^ means the beginning of a line. $ means the end of a line.

2. \A means the start of the string.  \z means the end of the string.

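3. \b means word boundary ( |word| ), where each pipe marks a boundary. A word boundary sits at the edge of a word, between a word character and a non-word character or the start or end of the string.
   
    \B means non-word boundary ( w|o|r|d ), where each pipe marks a non-boundary position, here between the letters inside a word.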
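A small Ruby sketch of the difference between the line anchors, the string anchors and the word boundaries. The strings and variable names are made up for illustration.

```ruby
# Illustrative strings and variable names only.
text = "cat\ncatalog"

# ^ and $ work per line; \A and \z work on the whole string.
line_starts   = text.scan(/^cat/).size     # 2, one per line
string_starts = text.scan(/\Acat/).size    # 1, only at the very start
line_ends     = text.scan(/cat$/).size     # 1, the "cat" before the newline
ends_with_cat = text.match?(/cat\z/)       # false, the string ends in "log"
ends_with_log = text.match?(/log\z/)       # true

# \b keeps the match to whole words; \B requires a non-boundary.
whole_word  = "cat catalog".scan(/\bcat\b/).size        # 1
inside_word = "cat catalog bobcat".scan(/cat\B/).size   # 1, the "cat" in "catalog"
```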

Quantifiers
Quantifiers are useful for matching repeated patterns.

Zero or more
*    matches 0 or more occurrences of the pattern to its immediate left.
It really does mean zero or more: /x*/ matches an empty string, because zero occurrences of x is a valid match.

Use (123)* to match 0 or more occurrences of the sequence 123.
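A minimal Ruby illustration of the star, including the empty match. The sample strings and variable names are made up.

```ruby
# Illustrative strings and variable names only.
run_of_x    = "xxy".match(/x*/)[0]             # "xx"
empty_match = "y".match(/x*/)[0]               # "" (zero x's at the start)

# Grouping lets * repeat a whole sequence.
repeated    = "123123abc".match(/(123)*/)[0]   # "123123"
still_match = "abc".match?(/(123)*/)           # true, zero repetitions
```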

One or more
+    matches 1 or more occurrences of the pattern to its immediate left.
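For comparison with the star, here is a hedged Ruby sketch of the plus using invented sample strings: unlike /x*/, /x+/ needs at least one x.

```ruby
# Illustrative strings and variable names only.
run_of_x   = "xxy".match(/x+/)[0]        # "xx"
no_x       = "y".match?(/x+/)            # false, at least one x is required
digit_runs = "a1b22c333".scan(/\d+/)     # ["1", "22", "333"]
```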


Zero or one
?    matches 0 or 1 occurrence of the pattern to its immediate left.
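One common use of the question mark is an optional letter. The sketch below uses made-up words and anchors the pattern with \A and \z so that the whole word has to match.

```ruby
# Illustrative words and variable names only.
words = %w[color colour colouur]

# The "u" may appear zero times or once, but not twice.
results = words.map { |word| word.match?(/\Acolou?r\z/) }   # [true, true, false]
```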

Any number of occurrences
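{m} matches exactly m occurrences of the pattern to its immediate left.

{m,n} matches from m up to n occurrences of the pattern to its immediate left.

{m,} matches m or more occurrences of the pattern to its immediate left.

Write the braces without spaces. In Ruby, a form with spaces such as { m, n } is treated as literal text, not as a quantifier.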
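The three brace forms can be sketched in Ruby as follows. The date-like string and the variable names are invented for the example, and the braces are written without spaces.

```ruby
# Illustrative strings and variable names only.

# {m}: exactly m occurrences.
date_like = "2024-01-15".match?(/\A\d{4}-\d{2}-\d{2}\z/)   # true

# {m,n}: from m up to n occurrences (takes as many as it can, here 3).
two_to_three = "aaaa".match(/a{2,3}/)[0]                    # "aaa"

# {m,}: m or more occurrences.
too_few     = "a".match?(/a{2,}/)                           # false
two_or_more = "aaaaa".match(/a{2,}/)[0]                     # "aaaaa"
```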


















