TechNet

IT Toolbox: New Products for IT Professionals

If you need to quickly find duplicate files or perform a pattern-based search for groups of files, this month’s tools can help you out.

Greg Steen

Fast Duplicate File Finder: Free Edition

You know it happens. One of your users e-mails a PDF or Word document to the staff, and everyone saves their own copy. Your Network Attached Storage (NAS) ends up with 100 copies of the same document. The same thing can happen on your personal system. Remember all those photos you meant to organize but never did? Now you have four copies of every JPG you’ve ever saved.

Platforms such as SharePoint provide accessible document repositories, and other tools can help with these kinds of issues, but none of them help once everyone starts saving their own copies of data that’s already backed up. Space may be cheap these days, but it’s not that cheap, especially when you consider the costs of power, backup and heat. One type of utility that can help remedy the situation is a duplicate finder, such as the Fast Duplicate File Finder from Mindgems Inc.

To run your first duplicate scan, simply add the target folders to scan and click “Start Scan.” With the default settings, Fast Duplicate File Finder will compare files of the same extension that are 100 percent equal. You can also compare files based on a percentage of similarity. You can even choose to ignore file extensions.
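If you’re curious what an exact-match scan involves, here’s a minimal Python sketch of one common approach. It is not Fast Duplicate File Finder’s actual algorithm, which the vendor doesn’t describe here. The function names `file_digest` and `find_duplicates` and the `ignore_extension` and `skip_empty` parameters are made up for illustration. The idea is to bucket files by size and extension first, then hash only the files that share a bucket.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folders, ignore_extension=False, skip_empty=True):
    """Return groups (sorted lists of paths) of byte-identical files."""
    # Pass 1: bucket by size (and extension, unless ignored). This is cheap
    # and rules out most files without reading their contents.
    buckets = defaultdict(list)
    for folder in folders:
        for path in Path(folder).rglob("*"):
            if not path.is_file():
                continue
            size = path.stat().st_size
            if skip_empty and size == 0:
                continue
            extension = "" if ignore_extension else path.suffix.lower()
            buckets[(size, extension)].append(path)

    # Pass 2: hash only the files that share a bucket with another file.
    groups = []
    for paths in buckets.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates(["D:/Shares/Docs"])` (a hypothetical path) returns a list of groups, each holding the paths of identical files. The sketch assumes the folders you pass in don’t overlap; otherwise a file would be listed twice and reported as a duplicate of itself.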

If you’re scanning a huge set of files, you can save your duplicate finder project and log off, hibernate, stand by or shut down the machine. Organizing your duplicate hunts into projects lets you easily reuse the template for periodic checks against the same target folder sets. Other options include skipping zero-byte files, auto-minimizing the application when the scan starts, having it protect and avoid system files and folders, and choosing the process priority for the scan. That last option lets you either keep the scan from using all your system resources or give it maximum resource access so it finishes faster.

Once your scan is complete, Fast Duplicate File Finder gives you several options. You can have it automatically check possible duplicates based on the timestamp. You can also exclude files it shouldn’t mark as duplicates based on timestamps, extensions or wildcard patterns. You can have the program automatically move the duplicate files, keeping the folder structure, to an alternate location. This lets you archive the duplicate files, just in case. If you’re happy with the results, you can simply delete the duplicates. You can also have it wipe out any empty folders it finds.
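The “move but keep the folder structure” idea is easy to picture in code. The following Python sketch is an illustration only, not the product’s implementation. The function name `archive_duplicates` is hypothetical, and treating the oldest file in each group as the original is an assumption made here to mirror the timestamp-based checking described above.

```python
import shutil
from pathlib import Path


def archive_duplicates(groups, scan_root, archive_root):
    """Keep the oldest file in each group; move the rest under archive_root.

    Each moved file keeps its path relative to scan_root, so the archive
    mirrors the original folder structure. Every file must live under
    scan_root, otherwise relative_to() raises ValueError.
    """
    scan_root = Path(scan_root).resolve()
    archive_root = Path(archive_root).resolve()
    moved = []
    for group in groups:
        # Assumption: the oldest modification time marks the original.
        ordered = sorted(group, key=lambda p: (Path(p).stat().st_mtime, str(p)))
        for duplicate in ordered[1:]:
            source = Path(duplicate).resolve()
            target = archive_root / source.relative_to(scan_root)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(source), str(target))
            moved.append(target)
    return moved
```

Because nothing is deleted, you can inspect the archive folder and copy a file back if the scan flagged something you still need. Keep the archive folder outside the scanned folder so a later scan doesn’t pick it up.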

The program presents comparison results in the main window with the filename, folder, size, timestamp, similarity percentage and duplicate group number. This data helps you identify which files you can really remove. There’s also a preview pane that shows file details such as text-file contents and images, so you can stay in the application as you verify the duplicates. You can also right-click on an item to open it in its standard associated program. This is helpful for binary files that don’t lend themselves to a quick preview.

By default, Fast Duplicate File Finder checks off the duplicates it proposes to delete. You can select or de-select items yourself. Items you change are highlighted in a different color, so you can easily find what you changed before you commit your move or delete action on the duplicates.

Fast Duplicate File Finder is free and runs on pretty much all flavors of the Windows OS. You can download it directly from the Web site. There’s also a paid version, Fast Duplicate File Finder Professional, that sells for $39.95. The paid version adds several interesting features. You can find “similar” files, such as an image whose brightness has been increased or a text file with an extra paragraph. You can filter search results to exclude file types and directories, or to limit candidate files by size or date. The Professional version also lets you export results to CSV or XML and run scans from the command line.

The next time you’re trying to free up some space, try a tool like the Fast Duplicate File Finder to get that wasted space back.

PowerGREP

Finding a file or a set of files may be relatively easy, but what if you want to find a particular string or pattern within those files? This can be tedious, especially if you’re talking about a large volume of files or files dispersed across various machines. One tool that aims to help you find exactly what you’re looking for is PowerGREP from Just Great Software Co. Ltd.

As the name implies, the program gives you the power of the Linux or Unix “grep” shell command for regular expression-based searching. You can search files across a network or local file system with an easy-to-use Windows GUI. You can also run it from the command line, much as you would a Windows PowerShell script. And, like grep, it lets you not only find text within files, but also easily replace text matching your patterns with something else.

PowerGREP takes this a step further, and lets you take a number of different actions besides basic search and replace or listing results. You can opt for the following:

  • Data collection: aggregates matched text into a new file
  • Rename files: lets you rename files en masse based upon a search/replace pattern
  • Merge files: takes all file matches and merges them into one file
  • Split files: uses replacement text syntax to specify how the target file should be split into parts
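To make the first two actions in the list above concrete, here’s a rough Python sketch of the same ideas. It doesn’t use PowerGREP or reflect how it works internally, and it relies on Python’s `re` module, whose regular expression syntax may differ from what PowerGREP accepts. The names `collect_matches` and `plan_renames` are hypothetical.

```python
import re
from pathlib import Path


def collect_matches(root, pattern, glob="*.txt"):
    """Return every regex match found in files under root, in path order."""
    regex = re.compile(pattern)
    matches = []
    for path in sorted(Path(root).rglob(glob)):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            matches.extend(m.group(0) for m in regex.finditer(text))
    return matches


def plan_renames(root, pattern, replacement):
    """Return (old_path, new_path) pairs without touching the disk."""
    regex = re.compile(pattern)
    plan = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            new_name = regex.sub(replacement, path.name)
            if new_name != path.name:
                plan.append((path, path.with_name(new_name)))
    return plan
```

Writing the output of `collect_matches` to a new file gives you a simple form of data collection. `plan_renames` deliberately stops short of renaming anything, so you can review the list before calling `old_path.rename(new_path)` on each pair.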

You can script sequences of actions and save those for later reuse. Another nice feature of PowerGREP is the built-in Assistant pane, which gives you concise, useful tips on features and interface elements. Simply click into or hover over an element within the application to see its tip, and you won’t have to use the standard help menu options.

PowerGREP has a detailed help guide that includes an explanation of each component as well as a number of samples, references and tutorials on regular expressions. There’s also a script library with a number of useful regular expressions, like finding e-mail addresses, splitting Web log files, replacing HTML attributes or tags, replacing file names, and, of course, replacing or finding text within a file set. Once you have your action or sequence ready to go, you can point it to a target path and execute the search. If you’re doing a replace or edit action, PowerGREP also keeps an “undo” history in case you make a mistake.
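The value of an undo history is easiest to see when you script a replace yourself. This Python sketch shows one simple way to get a safety net: copy the file before changing it. It’s an assumption-laden illustration, not PowerGREP’s undo mechanism. The function names and the `.bak` suffix are invented for the example.

```python
import os
import re
import shutil
from pathlib import Path


def replace_in_file(path, pattern, replacement, backup_suffix=".bak"):
    """Apply a regex replacement to one file; return the number of changes."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    updated, count = re.subn(pattern, replacement, original)
    if count:
        # Keep a copy of the original so the change can be rolled back.
        shutil.copy2(path, path.with_name(path.name + backup_suffix))
        path.write_text(updated, encoding="utf-8")
    return count


def undo_replace(path, backup_suffix=".bak"):
    """Restore a file from the backup written by replace_in_file."""
    path = Path(path)
    backup = path.with_name(path.name + backup_suffix)
    os.replace(backup, path)  # overwrites the edited file with the backup
```

For example, `replace_in_file("page.html", r"<(/?)b>", r"<\1strong>")` would turn `<b>` tags into `<strong>` tags and leave `page.html.bak` behind. A file with no matches is left untouched and gets no backup.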

You can preview your outcome before you actually commit changes to the files. You can also sort, group, total and highlight the results to verify you have the data you want to change, and you can inspect the matches without leaving the program. Running PowerGREP from the command line is great for batch files, external tools and scheduled tasks, such as automated log file splits or “alert-on-error”-type log-file parsing.
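An “alert-on-error” job boils down to scanning a log for a pattern and signaling the result through an exit code that a scheduled task can act on. Here’s a minimal Python sketch of that pattern. It isn’t PowerGREP’s command-line interface. The `ERROR`/`FATAL` keywords are an assumed log format, and the names `find_errors` and `main` are hypothetical.

```python
import re
import sys
from pathlib import Path

# Assumed log format: severity keywords appear as whole words.
ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL)\b")


def find_errors(log_path, pattern=ERROR_PATTERN):
    """Return (line_number, line) pairs for lines matching the pattern."""
    hits = []
    with Path(log_path).open(encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if pattern.search(line):
                hits.append((number, line.rstrip("\n")))
    return hits


def main(argv):
    """Print matching lines; return 1 if any were found, 0 if none."""
    if len(argv) != 2:
        print("usage: alert_on_error.py LOGFILE", file=sys.stderr)
        return 2
    hits = find_errors(argv[1])
    for number, line in hits:
        print(f"{argv[1]}:{number}: {line}")
    return 1 if hits else 0
```

To use this as a script, end the file with `sys.exit(main(sys.argv))`. A batch file or scheduled task can then check the exit code and send a notification whenever it’s nonzero.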

PowerGREP costs $159 for a single license, with discounts as you increase volume. PowerGREP also comes with a three-month money-back guarantee. A limited free trial version is also available for download from the product Web site.

Greg Steen
Greg Steen is a technology professional, entrepreneur and enthusiast. He’s always on the hunt for new tools to help make operations, QA and development easier for the IT professional.
