How to find the file name with the latest version in C#

I have a folder full of .dwg files. For each drawing I need to find the latest version of the file (or the file itself, if it has no versions) and copy it to a directory. For example, here are three files:

ABBIE 08-10 #6-09H4 FINAL 06-12-2012.dwg
ABBIE 08-10 #6-09H4 FINAL 06-12-2012_1.dwg
ABBIE 08-10 #6-09H4 FINAL 06-12-2012_2.dwg

Note that the only difference is that one file ends in _1 and another in _2, so the latest file here is the _2 one. I need to keep the latest file and copy it to a directory. Some files have no versions at all, so they can simply be copied. I can't rely on the file creation date or the modified date, because in many cases they are identical, so the file name is all I have to go on. I am sure there is a more efficient way to do this than what I post below.

    DirectoryInfo myDir = new DirectoryInfo(@"H:\Temp\Test");
    var Files = myDir.GetFiles("*.dwg");
    string[] fileList = Directory.GetFiles(@"H:\Temp\Test", "*FINAL*", SearchOption.AllDirectories);
    ArrayList list = new ArrayList();
    ArrayList WithUnderscores = new ArrayList();
    string nameNOunderscores = "";

    for (int i = 0; i < fileList.Length; i++)
    {
        //Try to get just the filename..
        string filename = fileList[i].Split('.')[0];
        int position = filename.LastIndexOf('\\');
        filename = filename.Substring(position + 1);
        filename = filename.Split('_')[0];

        foreach (FileInfo allfiles in Files)
        {
            var withoutunderscore = allfiles.Name.Split('_')[0];
            withoutunderscore = withoutunderscore.Split('.')[0];
            if (withoutunderscore.Equals(filename))
            {
                nameNOunderscores = filename;
                list.Add(allfiles.Name);
            }
        }

        //If there is a number after the _ then capture it in an ArrayList
        if (list.Count > 0)
        {
            foreach (string nam in list)
            {
                if (nam.Contains("_"))
                {
                    //need regex to grab numeric value after _
                    var match = new Regex("_(?<number>[0-9]+)").Match(nam);
                    if (match.Success)
                    {
                        var value = match.Groups["number"].Value;
                        var number = Int32.Parse(value);
                        WithUnderscores.Add(number);
                    }
                }
            }

            int removedcount = 0;

            //Whats the max value?
            if (WithUnderscores.Count > 0)
            {
                var maxval = GetMaxValue(WithUnderscores);
                Int32 intmax = Convert.ToInt32(maxval);
                foreach (FileInfo deletefile in Files)
                {
                    string shorten = deletefile.Name.Split('.')[0];
                    shorten = shorten.Split('_')[0];
                    if (shorten == nameNOunderscores && deletefile.Name != nameNOunderscores + "_" + intmax + ".dwg")
                    {
                        //Keep track of count of Files that are no good to us so we can iterate to next set of files
                        removedcount = removedcount + 1;
                    }
                    else
                    {
                        //Copy the "Good" file to a seperate directory
                        File.Copy(@"H:\Temp\Test\" + deletefile.Name, @"H:\Temp\AllFinals\" + deletefile.Name, true);
                    }
                }
                WithUnderscores.Clear();
                list.Clear();
            }
            i = i + removedcount;
        }
        else
        {
            //This File had no versions so it is good to be copied to the "Good" directory
            File.Copy(@"H:\Temp\SH_Plats\" + filename, @"H:\Temp\AllFinals" + filename, true);
            i = i + 1;
        }
    }
5 answers

I came up with a Regex-based solution, even though I'm late to the party.

(?<fileName>[A-Za-z0-9-# ]*)_?(?<version>[0-9]+)?\.dwg

This regular expression recognizes the file name and the version and splits them into named groups. A fairly simple foreach loop then collects the highest version per file name in a dictionary (since I'm lazy); a file without a version suffix is stored as version 0. After that you only need to put the file name back together before using it:

    var fileName = file.Value > 0 ? file.Key + "_" + file.Value + ".dwg" : file.Key + ".dwg";

Full code:

    var files = new[]
    {
        "ABBIE 08-10 #6-09H4 FINAL 06-12-2012.dwg",
        "ABBIE 08-10 #6-09H4 FINAL 06-12-2012_1.dwg",
        "ABBIE 08-10 #6-09H4 FINAL 06-12-2012_2.dwg",
        "Second File.dwg",
        "Second File_1.dwg",
        "Third File.dwg"
    };

    // regex to split fileName from version
    var r = new Regex(@"(?<fileName>[A-Za-z0-9-# ]*)_?(?<version>[0-9]+)?\.dwg");
    var latestFiles = new Dictionary<string, int>();

    foreach (var f in files)
    {
        var parsedFileName = r.Match(f);
        var fileName = parsedFileName.Groups["fileName"].Value;
        // files without a version suffix count as version 0
        var version = parsedFileName.Groups["version"].Success
            ? int.Parse(parsedFileName.Groups["version"].Value)
            : 0;

        int knownVersion;
        if (!latestFiles.TryGetValue(fileName, out knownVersion) || version > knownVersion)
        {
            // add newly found file names, or replace if this file has a newer version
            latestFiles[fileName] = version;
        }
    }

    // open all most recent files
    foreach (var file in latestFiles)
    {
        // version 0 means the file name has no "_N" suffix
        var name = file.Value > 0 ? file.Key + "_" + file.Value + ".dwg" : file.Key + ".dwg";
        using (var fileToCopy = File.Open(name, FileMode.Open))
        {
            // ...
        }
    }

You can use this LINQ query with Enumerable.GroupBy, which should work (now tested):

    var allFiles = Directory.EnumerateFiles(sourceDir, "*.dwg")
        .Select(path => new
        {
            Path = path,
            FileName = Path.GetFileName(path),
            FileNameWithoutExtension = Path.GetFileNameWithoutExtension(path),
            VersionStartIndex = Path.GetFileNameWithoutExtension(path).LastIndexOf('_')
        })
        .Select(x => new
        {
            x.Path,
            x.FileName,
            IsVersionFile = x.VersionStartIndex != -1,
            Version = x.VersionStartIndex == -1
                ? new Nullable<int>()
                : x.FileNameWithoutExtension.Substring(x.VersionStartIndex + 1).TryGetInt(),
            // group on the name without extension so that "X.dwg" and "X_1.dwg" share a key
            NameWithoutVersion = x.VersionStartIndex == -1
                ? x.FileNameWithoutExtension
                : x.FileNameWithoutExtension.Substring(0, x.VersionStartIndex)
        })
        .OrderByDescending(x => x.Version)   // null versions sort last
        .GroupBy(x => x.NameWithoutVersion)
        .Select(g => g.First());             // highest version per name

    foreach (var file in allFiles)
    {
        string oldPath = Path.Combine(sourceDir, file.FileName);
        string newPath;
        if (file.IsVersionFile && file.Version.HasValue)
            newPath = Path.Combine(versionPath, file.FileName);
        else
            newPath = Path.Combine(noVersionPath, file.FileName);
        File.Copy(oldPath, newPath, true);
    }

Here is the extension method that I use to check whether a string can be parsed to an int:

    public static int? TryGetInt(this string item)
    {
        int i;
        bool success = int.TryParse(item, out i);
        return success ? (int?)i : (int?)null;
    }

Please note that I don't use regular expressions here, only string methods.


Try

    var files = new My.Computer().FileSystem.GetFiles(
        @"c:\to\the\sample\directory",
        Microsoft.VisualBasic.FileIO.SearchOption.SearchAllSubDirectories,
        "*.dwg");
    foreach (String f in files)
    {
        Console.WriteLine(f);
    }

NB: Add a reference to Microsoft.VisualBasic and put the following using alias at the top of the file:

 using My = Microsoft.VisualBasic.Devices; 

UPDATE

Working sample [verified]:

    String dPath = @"C:\to\the\sample\directory";
    var xfiles = new My.Computer().FileSystem.GetFiles(
            dPath,
            Microsoft.VisualBasic.FileIO.SearchOption.SearchAllSubDirectories,
            "*.dwg")
        .Where(c => Regex.IsMatch(c, @"\d{3,}\.dwg$"));
    XElement filez = new XElement("filez");
    foreach (String f in xfiles)
    {
        var yfiles = new My.Computer().FileSystem.GetFiles(
                dPath,
                Microsoft.VisualBasic.FileIO.SearchOption.SearchAllSubDirectories,
                string.Format("{0}*.dwg", System.IO.Path.GetFileNameWithoutExtension(f)))
            .Where(c => Regex.IsMatch(c, @"_\d+\.dwg$"));
        if (yfiles.Count() > 0)
        {
            filez.Add(new XElement("file", yfiles.Last()));
        }
        else
        {
            filez.Add(new XElement("file", f));
        }
    }
    Console.Write(filez);

Can you do this by sorting the strings? The only tricky part I see here is converting the file name to a sortable format: just replace the dd-mm-yyyy date with yyyymmdd. Then sort the list and pull out the last entry.
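A minimal sketch of that idea, assuming every name follows the "<name> dd-mm-yyyy[_N].dwg" layout of the samples in the question. The class and method names (SortableNames, ToSortKey, Latest) are made up for illustration. Besides rewriting the date, the sketch zero-pads the version, which is an addition to the suggestion above: without padding, _10 would sort before _2 as a plain string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SortableNames
{
    // Assumed layout: "<name> dd-mm-yyyy[_N].dwg", taken from the question's samples.
    private static readonly Regex Pattern = new Regex(
        @"^(?<name>.*?)(?<day>\d{2})-(?<month>\d{2})-(?<year>\d{4})(?:_(?<version>\d+))?\.dwg$",
        RegexOptions.IgnoreCase);

    // Builds a key that orders correctly as a plain string:
    // name, then yyyymmdd, then the version padded to six digits.
    public static string ToSortKey(string fileName)
    {
        var m = Pattern.Match(fileName);
        if (!m.Success)
        {
            return fileName; // unknown layout: fall back to the raw name
        }

        // a missing "_N" suffix counts as version 0
        int version = m.Groups["version"].Success ? int.Parse(m.Groups["version"].Value) : 0;
        return m.Groups["name"].Value
            + m.Groups["year"].Value + m.Groups["month"].Value + m.Groups["day"].Value
            + "_" + version.ToString("D6");
    }

    // Sorts by the key and returns the last entry, as suggested above.
    public static string Latest(IEnumerable<string> fileNames)
    {
        return fileNames.OrderBy(n => ToSortKey(n), StringComparer.Ordinal).Last();
    }
}
```

Latest is meant to be called on the names that belong to one drawing; with several drawings in the same folder the names would have to be grouped first.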


This does what you want, provided fileList contains all the file names:

    List<string> latestFiles = new List<string>();
    foreach (var groups in fileList.GroupBy(x => Regex.Replace(x, @"(_\d+\.dwg$|\.dwg$)", "")))
    {
        // (?<=_) ensures only a "_N" suffix counts as a version, not the trailing year of the date
        latestFiles.Add(groups.OrderBy(s =>
        {
            var version = Regex.Match(s, @"(?<=_)\d+(?=\.dwg$)").Value;
            return version == "" ? 0 : int.Parse(version);
        }).Last());
    }

latestFiles then contains the latest version of each file.

If fileList is large, consider using threading or PLINQ.
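As a rough sketch of the PLINQ variant: the same grouping can run on a ParallelQuery by calling AsParallel() first. The class name LatestFinder and the method FindLatest are hypothetical, and whether this is actually faster than the sequential loop depends on the size of the list, so it is worth measuring before adopting it.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LatestFinder
{
    // Optional "_N" version suffix followed by the extension.
    private static readonly Regex Suffix = new Regex(
        @"(?:_(?<version>\d+))?\.dwg$", RegexOptions.IgnoreCase);

    public static List<string> FindLatest(IEnumerable<string> fileList)
    {
        return fileList
            .AsParallel()
            .Select(name =>
            {
                var m = Suffix.Match(name);
                return new
                {
                    Name = name,
                    // everything before the suffix identifies the drawing
                    Base = m.Success ? name.Substring(0, m.Index) : name,
                    // a missing "_N" suffix counts as version 0
                    Version = m.Groups["version"].Success
                        ? int.Parse(m.Groups["version"].Value)
                        : 0
                };
            })
            .GroupBy(x => x.Base)
            .Select(g => g.OrderBy(x => x.Version).Last().Name)
            .ToList();
    }
}
```

The order of the returned list is not guaranteed with PLINQ, so sort it afterwards if the order matters.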
