How do I write a regular expression to match the torrent title format?

I am trying to match and split typical TV torrent titles:

MyTV.Show.S09E01.HDTV.XviD
MyTV.Show.S10E02.HDTV.XviD
MyTV.Show.901.HDTV.XviD
MyTV.Show.1102.HDTV.XviD

I am trying to break these lines into 3 capture groups for each record: Title, Season, Episode.

I can handle the first 2 quite easily:

^([a-zA-Z0-9.]*)\.S([0-9]{1,2})E([0-9]{1,2}).*$

However, for the third and fourth, I find it difficult to separate the season from the episode. It would be easier if I could work backwards. For example, taking "901" and reading from the right, the last two digits are the episode number, and whatever remains before them is the season number.

Does anyone have any tips on how I can break these lines into their respective capture groups?

2 answers

Here is what I would use:

(.*?)\.S?(\d{1,2})E?(\d{2})\.(.*)

It has four capture groups:

1: Name
2: Season
3: Episode
4: The Rest

Here is some C# code that demonstrates it (adapted from another post):

using System;
using System.Text.RegularExpressions;

public class Test
{

    public static void Main()
    {
        string s = @"MyTV.Show.S09E01.HDTV.XviD
            MyTV.Show.S10E02.HDTV.XviD
            MyTV.Show.901.HDTV.XviD
            MyTV.Show.1102.HDTV.XviD";

        Extract(s);

    }

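    // Lazy name, optional "S", 1-2 season digits, optional "E",
    // exactly 2 episode digits, a dot, then the rest of the line.
    private static readonly Regex rx = new Regex
        (@"(.*?)\.S?(\d{1,2})E?(\d{2})\.(.*)", RegexOptions.IgnoreCase);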

    static void Extract(string text)
    {
        MatchCollection matches = rx.Matches(text);

        foreach (Match match in matches)
        {
            Console.WriteLine("Name: {0}, Season: {1}, Ep: {2}, Stuff: {3}\n",
                match.Groups[1].ToString().Trim(), match.Groups[2], 
                match.Groups[3], match.Groups[4].ToString().Trim());
        }
    }

}

It produces:

Name: MyTV.Show, Season: 09, Ep: 01, Stuff: HDTV.XviD
Name: MyTV.Show, Season: 10, Ep: 02, Stuff: HDTV.XviD
Name: MyTV.Show, Season: 9, Ep: 01, Stuff: HDTV.XviD
Name: MyTV.Show, Season: 11, Ep: 02, Stuff: HDTV.XviD
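
The all-digit forms split correctly because of backtracking: for "901" the engine first tries two digits for the season ("90"), finds only one digit left for the two-digit episode, and falls back to a one-digit season. That is effectively the "work backwards" behaviour the question asks for. Below is a rough sketch of the same pattern with named groups and integer conversion. The `EpisodeParser` class, the `TryParse` helper, the group names, and the `^`/`$` anchors are my own additions for illustration, and the sketch assumes one title per input string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EpisodeParser
{
    // Same pattern as above, but with named groups and ^/$ anchors.
    // Assumption: each input string holds exactly one title.
    private static readonly Regex Rx = new Regex(
        @"^(?<name>.*?)\.S?(?<season>\d{1,2})E?(?<episode>\d{2})\.(?<rest>.*)$",
        RegexOptions.IgnoreCase);

    // Returns false when the title does not fit the pattern.
    public static bool TryParse(string title, out string name, out int season, out int episode)
    {
        name = null;
        season = 0;
        episode = 0;

        Match m = Rx.Match(title.Trim());
        if (!m.Success)
            return false;

        name = m.Groups["name"].Value;
        season = int.Parse(m.Groups["season"].Value);   // "09" becomes 9
        episode = int.Parse(m.Groups["episode"].Value); // "01" becomes 1
        return true;
    }

    public static void Main()
    {
        string[] titles =
        {
            "MyTV.Show.S09E01.HDTV.XviD",
            "MyTV.Show.901.HDTV.XviD",
            "MyTV.Show.1102.HDTV.XviD"
        };

        foreach (string t in titles)
        {
            if (TryParse(t, out string name, out int season, out int episode))
                Console.WriteLine("{0}: season {1}, episode {2}", name, season, episode);
        }
    }
}
```

One limitation to keep in mind: because the name group is lazy, a title that itself contains a dotted three- or four-digit number (say, a hypothetical "Some.Show.100.S01E02.HDTV") will match at that number first and report season 1, episode 00. If your titles can look like that, you will need a stricter pattern.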

Almost every media file I have ever seen that came from a torrent used two digits for the episode number. Given that, you can use E([0-9]{2}) instead and the expression will match.
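
To illustrate why fixing the episode at two digits matters, here is a small sketch that runs the same title through a one-or-two-digit episode group and through an exactly-two-digit one. The `EpisodeWidthDemo` class and `Split` helper are hypothetical names, and making `S` and `E` optional is my assumption so that the all-digit titles from the question can match at all.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EpisodeWidthDemo
{
    // Builds the title pattern around a caller-supplied episode sub-pattern
    // and returns "season/episode", or null if the title does not match.
    public static string Split(string title, string episodePattern)
    {
        Match m = Regex.Match(
            title,
            @"^(.*?)\.S?(\d{1,2})E?(" + episodePattern + @")\.",
            RegexOptions.IgnoreCase);

        return m.Success ? m.Groups[2].Value + "/" + m.Groups[3].Value : null;
    }

    public static void Main()
    {
        string title = "MyTV.Show.901.HDTV.XviD";

        // Variable-width episode: the greedy season group keeps "90".
        Console.WriteLine(Split(title, @"\d{1,2}")); // 90/1

        // Fixed two-digit episode: the season is forced back to "9".
        Console.WriteLine(Split(title, @"\d{2}"));   // 9/01
    }
}
```

With a variable-width episode the split of "901" is ambiguous and the greedy season group wins. Requiring exactly two episode digits removes the ambiguity, at the cost of rejecting titles with single-digit episode numbers.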

