PowerShell ForEach hash comparison between directories - arrays

I'm working on a script to compare two directories. There are two main things I want the script to show in the output--which files exist on one directory but not the other, and which files appear in both directories but have differences in them. Matching files don't need to show up.
I got some advice before on how to achieve this, but since I'm still pretty new to PS I'm having trouble executing it. What I'm trying to do is this:
I have Path #1. For each file in that path, I want to test for their existence on Path #2.
If the file exists in both paths, do a hash comparison between them. If there are differences, add the files to List A.
If the file appears in Path 1 but not Path 2, put them in List B.
This isn't as important, but would it also be possible to find files that exist in Path 2 but not Path 1? For work purposes that probably won't matter, but it will still be nice just in case.
Take the output and format it so that it can show something like: "The following files exist in Path 1 and not Path 2," and "The following files exist in both paths but have differences."
Basically, I don't just want an info dump of files to be the output and people end up having to puzzle through it. And like I said, I think the advice I received on how to do it will be good, I'm just having trouble making it work.
Here's the code I have so far:
$Source = @(Get-ChildItem -Recurse -Path \\SERVER\D$\PSTest)
foreach ($file in $Source){
    If ($Target = Test-Path @(Get-ChildItem -Recurse -Path \\SERVER\D$\PSTest))
    {
        $HashResult = (Compare-Object -ReferenceObject $file -DifferenceObject $Target -Property hash -PassThru).Path
    }
    else {
        $Missing += $file
    }
}
Write-Host 'These files have differences.' -ForegroundColor Green
$HashResult
Write-Host 'These files are missing from the target path.' -ForegroundColor Green
$Missing
When I run that, I don't get any results (other than the text output). Where am I going wrong with this?

Made a few assumptions about the file names and their uniqueness down through the various depths of the source/target folders:
$SourceDir = "C:\\temptest";
$DestDir = "D:\\temptest";
$SourceFiles = @(Get-ChildItem -Recurse -Path $SourceDir);
$DestFiles = @(Get-ChildItem -Recurse -Path $DestDir);
$SourceFileNames = $SourceFiles | % { $_.Name };
$DestFileNames = $DestFiles | % { $_.Name };
$MissingFromDestination = @();
$MissingFromSource = @();
$DifferentFiles = @();
foreach($f in $SourceFiles) {
if (!$DestFileNames.Contains($f.Name)) {
$MissingFromDestination += $f;
} else {
$t = $DestFiles | Where { $_.Name -eq $f.Name };
if ((Get-FileHash $f.FullName).hash -ne (Get-FileHash $t.FullName).hash) {
$DifferentFiles += $f;
}
}
}
foreach($f in $DestFiles) {
if (!$SourceFileNames.Contains($f.Name)) {
$MissingFromSource += $f;
}
}
"
Missing from Destination: "
$MissingFromDestination | % { $_.FullName };
"
Missing from Source: "
$MissingFromSource | % { $_.FullName };
"
Source is Different: "
$DifferentFiles | % { $_.FullName };
This is a bit naive in its approach insofar as it is really only checking file names and ignoring subfolder tree structures. But, hopefully, it will give you enough of a leaping off point.
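If the subfolder structure matters, one option is to key each file by its path relative to the root and compare hashes on that key. The following is only a sketch under a few assumptions: Compare-DirectoryHash and Get-RelativeHashMap are names made up for this example, SHA256 is an arbitrary choice of algorithm, and Get-FileHash needs PowerShell 4 or later.

```powershell
# Hypothetical helper: emits one record per file that is missing on either side
# or whose content hash differs. Files are matched by path relative to each root.
function Compare-DirectoryHash {
    param(
        [Parameter(Mandatory)][string]$SourceDir,
        [Parameter(Mandatory)][string]$DestDir
    )

    # Build a lookup of relative path -> SHA256 hash for every file under $Root.
    function Get-RelativeHashMap([string]$Root) {
        $rootPath = (Resolve-Path -LiteralPath $Root).ProviderPath.TrimEnd('\', '/')
        $map = @{}
        foreach ($file in Get-ChildItem -LiteralPath $rootPath -Recurse -File) {
            $relative = $file.FullName.Substring($rootPath.Length + 1)
            $map[$relative] = (Get-FileHash -LiteralPath $file.FullName -Algorithm SHA256).Hash
        }
        return $map
    }

    $source = Get-RelativeHashMap $SourceDir
    $dest   = Get-RelativeHashMap $DestDir

    foreach ($path in $source.Keys) {
        if (-not $dest.ContainsKey($path)) {
            [pscustomobject]@{ Status = 'MissingFromDestination'; RelativePath = $path }
        } elseif ($source[$path] -ne $dest[$path]) {
            [pscustomobject]@{ Status = 'Different'; RelativePath = $path }
        }
    }
    foreach ($path in $dest.Keys) {
        if (-not $source.ContainsKey($path)) {
            [pscustomobject]@{ Status = 'MissingFromSource'; RelativePath = $path }
        }
    }
}

# Example call (paths taken from the snippet above):
# Compare-DirectoryHash -SourceDir 'C:\temptest' -DestDir 'D:\temptest' |
#     Sort-Object Status, RelativePath | Format-Table -AutoSize
```

Grouping the output by Status gives the "exists in one path only" and "exists in both but differs" lists the question asks for.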

Related

Array not creating correctly in PowerShell

I have a script where I get all of the folders in d:\folder\*\*\ where the name is -like "*\Log". I then split the folder paths apart to run through wmi to get the corresponding services. After that I'm wanting to split apart the PathName property from $Services so I get everything before the \xxxxx.exe and add \log to the end of the result. Eventually I'll then use those paths to do some compression and archiving of files via a gci.
For whatever reason, when I run the script below I get the previous loop's $LocalLogVar without "log" appended and the current loop's $LocalLogVar with "log" appended. I'm sure I'm doing something wrong that's blatantly obvious to somebody out there. If somebody could point me in the right direction on this, it'd be much appreciated! I also apologize for the word vomit here; I've been looking at this script all day and my brain's pretty much used up.
A couple of notes:
The number of words in the paths vary which is why I can't manually do $LocalLogVar = "$Var1\$Var2\$Var3\Log"
If I don't have the [array] in front of $LogFolders the object type becomes a string and I get the previous loop's $LocalLogVar without "log" appended combined with the current loop's $LocalLogVar
I tried doing [collections.arraylist]$LogFolders=@() with no success
c:\folder is a shortcut to d:\folder, which is why there's c:\folder\xxx and d:\folder\xxx in the list below
SplitCount is -1 because I don't want the .exe from the path, I just want the folder structure
The naming convention for the string before .exe varies so I can't use an enumerated counter.
Example of first bullet:
word7-word8 #This is the previous loop's $LocalLogVar w/o "log" appended
C:\folder\word5\word6\word9-word8\log #This is the current loop's $LocalLogVar w/ "log" appended.
Example of the second bullet:
word7-word8C:\folder\word5\word6\word9-word8\log
What I should be getting:
D:\folder\word-anotherword\word7-word8\log
D:\folder\word-anotherword\word9-word8\log
C:\folder\word1\word7-word8\log
C:\folder\word1\word9-word8\log
C:\folder\word2\word7-word8\log
C:\folder\word2\word9-word8\log
D:\folder\word2\word10-word11\log
D:\folder\word2\word12-word8\log
C:\folder\word3\word7-word8\log
C:\folder\word3\word9-word8\log
D:\folder\word4\word7-word8\log
C:\folder\word4\word9-word8\log
C:\folder\word5\word6\word7-word8\log
C:\folder\word5\word6\word9-word8\log
C:\folder\word5\word6\word7-word8\log
C:\folder\word5\word6\word9-word8\log
$Folders = Get-ChildItem D:\folder\*\*\ -Directory -Recurse -Verbose `
| Where-Object { $_.fullname -like "*\Log" }
$2 = @()
$LogFolders = @()
foreach ($folder in $folders) {
$ServName = $folder.fullname.split('\')[2]
$ServType = $folder.fullname.split('\')[3]
$ServNameCheck = "*$($ServName.replace('-',' '))*"
$ServTypeCheck = "*$($ServType.replace('-',' '))*"
$PathName = Get-WmiObject -ClassName Win32_Service `
| Where-Object { $_.caption -like "$ServNameCheck" -and $_.caption -like "$ServTypeCheck" } `
| Select-Object Name, Caption, @{n = 'PathName'; e = { ($_.PathName).trim('"') } }
$2 += $PathName
}
$Services = $2 | Sort-Object pathname | Get-Unique -AsString
foreach ($ServPath in $services.pathname) {
$LocalLogVar = @()
if (Get-Variable `
| Where-Object { $_.name -match "^Split([0-9]|10)$" }) {
Get-Variable `
| Where-Object { $_.name -match "^Split([0-9]|10)$" } | Remove-Variable -Force
}
[int]$SplitCount = $ServPath.split('\').count
[int]$SplitCountCheck = $SplitCount - 1
$x = 0
do {
New-Variable -Name "Split$x" -Value "$($ServPath.split('\')[$x])"
$RegEx = "Split$x"
$LogFolderName = Get-Variable | Where-Object { $_.name -match $RegEx } | Select-Object value
[string]$LogFolders = $LogFolderName.value.ToString()
$LocalLogVar += $LogFolders + '\'
$x++
} until ($x -eq $SplitCountCheck)
$LocalLogVar = $LocalLogVar
$LocalLogVar = $LocalLogVar + "log"
[array]$LogFolders += $LocalLogVar
}
Wow, so that's a script. Kind of hard to follow, since some of it seems needlessly complex. I'm not sure if it will accomplish what you're looking for, but that's because you were super vague with your folder descriptions. Do the folders always start like this:
D:\folder\<Service Short Name>\<Service Long Name>\...\logs
If not, you could be in trouble. The last four items on your example list of what you expect to see don't look like that. I think your folders are laid out like this:
D:\folder\...\<Service Short Name>\<Service Long Name>\logs
The difference is where the extra folders are located. If they come before the service, as I think they do, your script will miss things.
Now, on to getting your list that you want. What I see from looking at your script is that you get a folder list for all folders under D:\folder\*\*\ named 'log'. Then you split out the 3rd and 4th folders to get a service's short name, and long name respectively. Then one by one you pull a list of all services from WMI, and filter for just the service that matches the name and caption (short name, and long name) referred to by the folders. After that you make sure you only have one listing of any given service.
Regarding this first part of the script, you can make it faster by letting the file system provider filter things for you. Instead of pulling a folder list of everything and then filtering for paths that end in '\log', you should use the -filter parameter of the Get-ChildItem cmdlet like this:
$Folders = Get-ChildItem C:\temp\*\*\ -Directory -Recurse -Verbose -Filter 'log'
Then you should query WMI one time, save the result, then pick and choose from there based on your folders. Something like:
# Query WMI once and reuse the result for every folder
$AllServices = Get-WmiObject -ClassName Win32_Service
[array]$2 = foreach ($folder in $folders) {
    $ServName,$ServType = $folder.fullname.split('\')[2,3] -replace '-',' '
    # Emit the matches directly so the foreach output is collected into $2
    $AllServices |
        Where-Object { $_.caption -like "*$ServName*" -and $_.caption -like "*$ServType*" } |
        Select-Object Name, Caption, @{n = 'PathName'; e = { $_.PathName -replace '^(\w\S+) .*','$1' -replace '^([''"])([^\1]+)\1.*','$2' } }
}
$Services = $2 | Sort-Object pathname | Get-Unique -AsString
I did a little regex magic to clean up the pathname instead of just .trim('"') since this gets rid of parameters in the service execution, and cleans paths that are enclosed in single quotes not just double quotes. If what you have works for you feel free to keep it, but this is a little more capable. It may be worth noting that Get-Unique is case sensitive, so 'C:\folder\word3\word9-word8' and 'C:\folder\word3\word9-Word8' are different. You might want to do a .ToUpper() on your paths before you look for unique ones.
Once you have your array of services you loop through them, splitting the file path, reassembling it, and finally adding 'log' to the end of it. That was your way to remove the executable from the path. There's a cmdlet that was designed to do just that: split-path. Use that with Join-Path and that whole last loop gets much simpler:
[array]$LogFolders = foreach ($ServPath in $services.pathname) {
Join-Path (Split-Path $ServPath) 'log'
}
Lastly, try not to use +=, since PowerShell has to rebuild the whole array each time you do that. You'll notice I moved the $Variable = bit outside the loop in places that you do that.
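To see the pieces work together on sample data, here is a small sketch. The three service paths are made up to stand in for $Services.PathName, and the example assumes a Windows machine with a C: drive, since Join-Path resolves the drive qualifier.

```powershell
# Sample service executable paths (hypothetical; they stand in for $Services.PathName).
$ServicePaths = @(
    'C:\folder\word3\word9-word8\service.exe'
    'C:\folder\word3\word9-Word8\service.exe'   # same path, different case
    'C:\folder\word1\word7-word8\service.exe'
)

# Get-Unique is case-sensitive, so normalise the case before de-duplicating.
$UniquePaths = $ServicePaths | ForEach-Object { $_.ToUpper() } | Sort-Object | Get-Unique

# Assign the loop's output once instead of growing an array with +=.
[array]$LogFolders = foreach ($ServPath in $UniquePaths) {
    Join-Path (Split-Path $ServPath) 'log'
}

$LogFolders
```

The two paths that differ only in case collapse into one entry, so $LogFolders ends up with two items rather than three.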

Powershell search through folders and sub folder for files with ".rtf" extension

I am trying to get all files w/in a directory that have the extension ".rtf". I have a working script, but it takes a while, as there is a foreach loop w/in a foreach loop. Is there a faster way to handle this? The goal of the script is to get all files w/in a directory ending in .rtf and use MSWord to Open the file and save it as a ".DOC". The conversion functionality works fine. The issue is with the length of time to search through all of the folders.
Function Convert-Dir($path)
{
$subFolders = get-childitem $path -Recurse | Where-Object {$_.PSIsContainer -eq $True}
if($subFolders)
{
foreach($folder in $subFolders)
{
if($folder.PSisContainer)
{
$Files=Get-ChildItem $folder.fullname -Filter "*.rtf"
$Word=New-Object -ComObject WORD.APPLICATION
if($Files)
{
foreach ($File in $Files)
{
$Doc=$Word.Documents.Open($File.fullname)
$Name=($Doc.name).replace("rtf","doc")
if (Test-Path $Name)
{
} else
{
# Use WORD
$fullName = ($Doc.path + "\" + "Converted_" + $Name)
$Doc.saveas([ref] $fullName, [ref] 0)
$Doc.close()
$fileToRemove = $File.fullName
Remove-Item $fileToRemove
$Word.Quit()
}
}
}
}
}
}
}
I suspect the time is lost creating Word instances: the function starts a new Word process for every subfolder. Use a single instance of Word throughout. Just move the line $Word=New-Object -ComObject WORD.APPLICATION to the top of your function and the line $word.quit() to the very end.
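On the search side, the loop over subfolders can probably be replaced by a single recursive listing, since Get-ChildItem -Recurse already walks the whole tree. A sketch of that idea follows; Get-RtfFile is a name made up for this example, and unlike the original function it also returns files sitting directly in the root folder.

```powershell
# Hypothetical helper: list every .rtf file below $Path in one recursive pass.
function Get-RtfFile {
    param([Parameter(Mandatory)][string]$Path)

    Get-ChildItem -LiteralPath $Path -Recurse -File -Filter '*.rtf' |
        # -Filter can also match longer extensions such as .rtfx, so check exactly.
        Where-Object { $_.Extension -eq '.rtf' }
}

# Outline of the conversion with one Word instance (not run here, needs Word installed):
# $Word = New-Object -ComObject Word.Application
# try     { foreach ($File in Get-RtfFile 'C:\docs') { <# open, SaveAs, Close #> } }
# finally { $Word.Quit() }
```

'C:\docs' in the outline is just a placeholder path.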

Recursing Through Multiple Text Files with References

I have hundreds of text files in a folder which can often reference each other, and go several levels deep. Not sure if I am explaining this well, so I will explain with an example.
Let's say folder "A" contains 500 .txt files. The first one could be called A.txt and somewhere in there it mentions B.txt, which in turn mentions C.txt and so on. I believe the number of levels down is no more than 10.
Now, I want to find certain text strings which relate to A.txt by programmatically going through that file, then if it sees references to other .txt files go through them as well and so on. The resulting output would be something like A_out.txt which contains everything it found based on a regex.
I started out with this using Powershell but am now a little stuck:
$files = Get-ChildItem "C:\TEST\" -Filter *.txt
$regex = 'PCB.*;'
for ($i=0; $i -lt $files.Count; $i++) {
$infile = $files[$i].FullName
$outfile = $files[$i].BaseName + "_out.txt"
select-string $infile -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } > $outfile
}
It goes through every .txt file and outputs everything that matches the PCB.*; expression to its corresponding _out.txt file.
I have absolutely no idea how to now expand this to include references to the other files. I'm not even sure if this is possible in PowerShell or whether I need to use another language to achieve what I want.
I could get some office monkeys to do all this manually, but if this is relatively simple to code then it would save us a lot of time. Any help would be greatly appreciated :)
/Edit
Whilst running through this in my head, I thought I could build up an array for every time another one of the files is mentioned, and then repeat the process for those as well. However, back to my original problem, I have no idea how I would go about this.
/Edit 2:
Sorry, had been away for a few days and am only just picking this up. I have been using what I've learnt from this question and a few others to come up with the following:
function Get-FileReference
{
Param($FileName, $OutputFileName='')
if ($OutputFileName -eq '')
{
Get-FileReference $FileName ($FileName -replace '.xml$', '_out.xml')
}
else
{
Select-String $FileName -Pattern 'BusinessObject.[^"\r\n\s][\w.]*' -AllMatches | % { $_.Matches } | % { $_.Value } | Add-Content $OutputFileName
Set-Location C:\TEST
$References = (Select-String -Pattern '(?<=resid=")\d+' -AllMatches -path $FileName | % { $_.Matches } | % { $_.Value })
Write "SC References: $References" | Out-File OUTPUT.txt -Append
foreach ($Ref in $References)
{
$count
Write "$count" | Out-File OUTPUT.txt -Append
$count++
Write "SC Reference: $Ref" | Out-File OUTPUT.txt -Append
$xml = [xml](Get-Content 'C:\TEST\package.xml')
$res = $xml.SelectSingleNode('//res[@id = $Ref]/child::resver[last()]')
$resource = $res.id + ".xml"
Write "File to Check $resource" | Out-File OUTPUT.txt -Append
Get-FileReference $resource $OutputFileName
}
}
}
$files = gci "C:\TEST" *.xml
ForEach ($file in $files) {
Get-FileReference $file.FullName
}
Following my original question, I realised that this was a little bit more extensive than I originally thought and therefore had to tinker.
These are the notable points:
All the parent files are .xml, and the code that matches on "BusinessObject" etc. works as expected.
The references to other files are not simply .txt but require a pattern match of '(?<=resid=")\d+'.
This pattern match needs to be cross-referenced with another file, package.xml, and based on the value it returns, the file it next needs to look into is [newname].xml.
As before, those child .xml files could reference some of the other .xml files.
The code I have pasted above seems to be getting stuck in endless loops (hence why I have debugging in there at the moment) and it is not liking the use of $Ref in:
$res = $xml.SelectSingleNode('//res[@id = $Ref]/child::resver[last()]')
That results in the following error:
Exception calling "SelectSingleNode" with "1" argument(s): "Namespace Manager or XsltContext needed. This query has a prefix, variable, or user-defined function."
Since there could be hundreds of files it dies when it gets over 1000+.
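On the SelectSingleNode error: the XPath is in single quotes, so PowerShell does not expand $Ref, and the XPath engine then sees $Ref as an XPath variable, which is what that exception complains about. Building the expression in double quotes should avoid it. A minimal sketch, where the XML is a made-up stand-in for package.xml (the real element layout is an assumption):

```powershell
# Made-up stand-in for package.xml; the real structure may differ.
$xml = [xml]@'
<package>
  <res id="101">
    <resver id="101_v1" />
    <resver id="101_v2" />
  </res>
</package>
'@

$Ref = '101'

# Double quotes let PowerShell expand $Ref; the inner single quotes make the
# value an XPath string literal.
$res = $xml.SelectSingleNode("//res[@id = '$Ref']/child::resver[last()]")
$resource = $res.GetAttribute('id') + '.xml'
$resource   # 101_v2.xml
```

If the id attribute can contain a single quote, the value would need escaping before it is placed into the expression.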
A recursive function which tries to do what you want.
function Get-FileReference
{
Param($FileName, $OutputFileName='')
if ($OutputFileName -eq '')
{
Get-FileReference $FileName ($FileName -replace '\.txt$', '_out.txt')
}
else
{
Select-String -Pattern 'PCB.*;' -Path $FileName -AllMatches | Add-Content $OutputFileName
$References = (Select-String -Pattern '^.*\.txt' -AllMatches -path $FileName).Matches.Value
foreach ($Ref in $References)
{
Get-FileReference $Ref $OutputFileName
}
}
}
$files = gci *.txt
ForEach ($file in $files) { Get-FileReference $file.FullName }
It takes two parameters - a filename and an output filename. If called without an output filename, it assumes it's at the top of a new recursion tree and generates an output filename to append to.
If called with an output filename (i.e. by itself) it searches for PCB patterns, appends to the output, then calls itself on any file references, with the same output filename.
This assumes that each file reference sits on a line of its own, with no spaces, e.g. xyz.txt.
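The endless loops mentioned in the question are what you get when two files reference each other. One way to guard against that is to remember which files have already been read and skip them the second time. The following is only a sketch: Get-FileReferenceOnce and its $Visited parameter are names invented here, and it assumes the referenced files live in the same folder as the file that mentions them.

```powershell
# Hypothetical variant of Get-FileReference that visits each file at most once.
function Get-FileReferenceOnce {
    param(
        [Parameter(Mandatory)][string]$FileName,
        [Parameter(Mandatory)][string]$OutputFileName,
        [hashtable]$Visited = @{}    # shared by all recursive calls of one top-level call
    )

    # Resolve to a full path so the same file is recognised however it was referenced.
    $fullPath = (Resolve-Path -LiteralPath $FileName -ErrorAction SilentlyContinue).ProviderPath
    if (-not $fullPath -or $Visited.ContainsKey($fullPath)) { return }
    $Visited[$fullPath] = $true

    # Append the text of every PCB match found in this file.
    Select-String -LiteralPath $fullPath -Pattern 'PCB.*;' -AllMatches |
        ForEach-Object { $_.Matches } |
        ForEach-Object { $_.Value } |
        Add-Content -LiteralPath $OutputFileName

    # Follow references: assumes each one is a line consisting only of a .txt file name.
    $folder = Split-Path -Parent $fullPath
    $references = Select-String -LiteralPath $fullPath -Pattern '^\S+\.txt$' |
        ForEach-Object { $_.Line.Trim() }
    foreach ($ref in $references) {
        Get-FileReferenceOnce -FileName (Join-Path $folder $ref) -OutputFileName $OutputFileName -Visited $Visited
    }
}
```

Because the hashtable is passed down by reference, a file reached through two different chains is still only processed once per top-level call.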

Comparing files within folders and moving files not found in one folder to a different location

I am working on folders that contain many types and sizes of files within them. What I want to do is move files that are not contained in one folder into a new folder. I have embedded a picture link that helps illustrate what I am aiming to do.
I would like test123.pdf to be moved to a new location because it's not contained within the other folder. Below I have some code that simply compares the contents of each folder and outputs which file is out of place. I have been researching some things online, but have come up empty. Can anyone help me proceed?
Disclaimer: I know the path is wrong, but I can't show it for security reasons.
$Folder1 = Get-ChildItem -Recurse -path "Enter Path here"
$Folder2 = Get-ChildItem -Recurse -path "Enter the Path here"
Compare-Object -ReferenceObject $Folder1 -DifferenceObject $Folder2
It sounds like you want to compare the contents of two folders, ID the file(s) that are not present in both folders, and then move them to a third folder. To accomplish this, you can define your 2 paths in variables, compare the contents of the folders, grab the full path names of the different items, and then move them to the new destination.
$path1 = 'yourfirstpath'
$path2 = 'yoursecondpathforcomparing'
$path3 = 'yourdestinationpath'
diff (ls $path1 -recurse) (ls $path2 -recurse) | ForEach {$_.InputObject.FullName} | Move-Item -Destination $path3
diff = Compare-Object , ls = Get-ChildItem
This will do the work: any file in Folder1 that has no counterpart in Folder2 is moved to Folder3.
$Folder1 = 'C:\folder1'
$Folder2 = 'C:\folder2'
$Folder3 = 'C:\folder3'
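foreach ($file in Get-ChildItem -Recurse -File $Folder1)
{
    # Path of the file relative to $Folder1, e.g. \sub\test123.pdf
    $relative = $file.FullName.Substring($Folder1.Length)
    if (-not (Test-Path -LiteralPath "$Folder2$relative"))
    {
        Move-Item -LiteralPath $file.FullName -Destination $Folder3
    }
}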
This will compare by file name, and maintain the directory structure the extra files were found in, and handle the recursion properly. The output of the code demonstrates the directory structure I tested with.
# be explicit here to ensure everything is lowercase
# so when this is cut and pasted, it does not break
# when you enter the real path
$d1 = "d:\test\one".ToLower()
$d2 = "d:\test\two".ToLower()
$f1 = gci -Recurse $d1 -File | % {$_.FullName.ToLower()}
$f2 = gci -Recurse $d2 -File | % {$_.FullName.ToLower()}
"f1"
$f1
"`nf2"
$f2
$same = @()
$extra = @()
foreach ($f in $f2)
{
$f2tof1path = $f.Replace($d2, $d1)
if ($f1.Contains($f2tof1path) -eq $false)
{
$extra += $f
}
else {
$same += $f
}
}
"`nSame"
$same
"`nExtra"
$extra
"`nMOVE"
$folder3 = "d:\test\three"
foreach ($f in $extra)
{
# move files somewhere, create dir if not exists
$dest = $f.Replace($d2,$folder3)
$destdir = $(Split-Path -Parent -Path $dest)
if (!(Test-Path $destdir))
{
# remove quotes to do new-item, keep them to show what it will do
"New-Item -ItemType Directory -Force -Path $destdir"
}
# remove quotes to do move-item, keep them to show what it will do
"Move-Item $f $dest"
}
Output
f1
d:\test\one\1.txt
d:\test\one\2.txt
d:\test\one\sub\1.txt
d:\test\one\sub\2.txt
f2
d:\test\two\1.txt
d:\test\two\2.txt
d:\test\two\3.txt
d:\test\two\sub\1.txt
d:\test\two\sub\2.txt
d:\test\two\sub\3.txt
Same
d:\test\two\1.txt
d:\test\two\2.txt
d:\test\two\sub\1.txt
d:\test\two\sub\2.txt
Extra
d:\test\two\3.txt
d:\test\two\sub\3.txt
MOVE
New-Item -ItemType Directory -Force -Path d:\test\three
Move-Item d:\test\two\3.txt d:\test\three\3.txt
New-Item -ItemType Directory -Force -Path d:\test\three\sub
Move-Item d:\test\two\sub\3.txt d:\test\three\sub\3.txt
If this does not solve your problem, it should get you 99% there. Play around with the code, have fun, and good luck.

Comparing folders and content with PowerShell

I have two different folders with xml files. One folder (folder2) contains updated and new xml files compared to the other (folder1). I need to know which files in folder2 are new/updated compared to folder1 and copy them to a third folder (folder3). What's the best way to accomplish this in PowerShell?
OK, I'm not going to code the whole thing for you (what's the fun in that?) but I'll get you started.
First, there are two ways to do the content comparison. The lazy/mostly right way, which is comparing the length of the files; and the accurate but more involved way, which is comparing a hash of the contents of each file.
For simplicity's sake, let's do it the easy way and compare file size.
Basically, you want two objects that represent the source and target folders:
$Folder1 = Get-childitem "C:\Folder1"
$Folder2 = Get-childitem "C:\Folder2"
Then you can use Compare-Object to see which items are different...
Compare-Object $Folder1 $Folder2 -Property Name, Length
which will list for you everything that is different by comparing only name and length of the file objects in each collection.
You can pipe that to a Where-Object filter to pick stuff that is different on the left side...
Compare-Object $Folder1 $Folder2 -Property Name, Length | Where-Object {$_.SideIndicator -eq "<="}
And then pipe that to a ForEach-Object to copy where you want:
Compare-Object $Folder1 $Folder2 -Property Name, Length | Where-Object {$_.SideIndicator -eq "<="} | ForEach-Object {
Copy-Item "C:\Folder1\$($_.name)" -Destination "C:\Folder3" -Force
}
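For the more accurate route mentioned at the start, comparing hashes, the same Compare-Object pattern works if each file is first wrapped in an object that carries its hash. This is a sketch rather than a finished script: Get-FileWithHash and Get-NewOrChangedFile are made-up names, SHA256 is an arbitrary choice, Get-FileHash needs PowerShell 4 or later, and this version selects the right-hand side (=>), i.e. files in the second folder that are new or changed.

```powershell
# Hypothetical helper: wrap each file in $Path with its name and content hash.
function Get-FileWithHash {
    param([Parameter(Mandatory)][string]$Path)
    Get-ChildItem -LiteralPath $Path -File | ForEach-Object {
        [pscustomobject]@{
            Name     = $_.Name
            Hash     = (Get-FileHash -LiteralPath $_.FullName -Algorithm SHA256).Hash
            FullName = $_.FullName
        }
    }
}

# Hypothetical helper: files under $Updated that are new or changed relative to $Baseline.
function Get-NewOrChangedFile {
    param(
        [Parameter(Mandatory)][string]$Baseline,
        [Parameter(Mandatory)][string]$Updated
    )
    $left  = @(Get-FileWithHash $Baseline)
    $right = @(Get-FileWithHash $Updated)

    # Compare-Object rejects empty input, so handle those cases up front.
    if ($right.Count -eq 0) { return }
    if ($left.Count -eq 0)  { return $right }

    Compare-Object $left $right -Property Name, Hash -PassThru |
        Where-Object { $_.SideIndicator -eq '=>' }
}

# Example call with the folders from the question:
# Get-NewOrChangedFile 'C:\Folder1' 'C:\Folder2' |
#     ForEach-Object { Copy-Item -LiteralPath $_.FullName -Destination 'C:\Folder3' -Force }
```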
Recursive Directory Diff Using MD5 Hashing (Compares Content)
Here is a pure PowerShell v3+ recursive file diff (no dependencies) that calculates an MD5 hash of each file's contents in the two directories (left/right). It can optionally export CSVs along with a summary text file. By default it writes results to stdout. You can either drop the rdiff.ps1 file into your path or copy the contents into your script.
USAGE: rdiff path/to/left,path/to/right [-s path/to/summary/dir]
Here is the gist. Recommended to use version from gist as it may have additional features over time. Feel free to send pull requests.
#########################################################################
### USAGE: rdiff path/to/left,path/to/right [-s path/to/summary/dir] ###
### ADD LOCATION OF THIS SCRIPT TO PATH ###
#########################################################################
[CmdletBinding()]
param (
[parameter(HelpMessage="Stores the execution working directory.")]
[string]$ExecutionDirectory=$PWD,
[parameter(Position=0,HelpMessage="Compare two directories recursively for differences.")]
[alias("c")]
[string[]]$Compare,
[parameter(HelpMessage="Export a summary to path.")]
[alias("s")]
[string]$ExportSummary
)
### FUNCTION DEFINITIONS ###
# SETS WORKING DIRECTORY FOR .NET #
function SetWorkDir($PathName, $TestPath) {
$AbsPath = NormalizePath $PathName $TestPath
Set-Location $AbsPath
[System.IO.Directory]::SetCurrentDirectory($AbsPath)
}
# RESTORES THE EXECUTION WORKING DIRECTORY AND EXITS #
function SafeExit() {
SetWorkDir /path/to/execution/directory $ExecutionDirectory
Exit
}
function Print {
[CmdletBinding()]
param (
[parameter(Mandatory=$TRUE,Position=0,HelpMessage="Message to print.")]
[string]$Message,
[parameter(HelpMessage="Specifies a success.")]
[alias("s")]
[switch]$SuccessFlag,
[parameter(HelpMessage="Specifies a warning.")]
[alias("w")]
[switch]$WarningFlag,
[parameter(HelpMessage="Specifies an error.")]
[alias("e")]
[switch]$ErrorFlag,
[parameter(HelpMessage="Specifies a fatal error.")]
[alias("f")]
[switch]$FatalFlag,
[parameter(HelpMessage="Specifies a info message.")]
[alias("i")]
[switch]$InfoFlag = !$SuccessFlag -and !$WarningFlag -and !$ErrorFlag -and !$FatalFlag,
[parameter(HelpMessage="Specifies blank lines to print before.")]
[alias("b")]
[int]$LinesBefore=0,
[parameter(HelpMessage="Specifies blank lines to print after.")]
[alias("a")]
[int]$LinesAfter=0,
[parameter(HelpMessage="Specifies if program should exit.")]
[alias("x")]
[switch]$ExitAfter
)
PROCESS {
if($LinesBefore -ne 0) {
foreach($i in 1..$LinesBefore) { Write-Host "" }
}
if($InfoFlag) { Write-Host "$Message" }
if($SuccessFlag) { Write-Host "$Message" -ForegroundColor "Green" }
if($WarningFlag) { Write-Host "$Message" -ForegroundColor "Yellow" }
if($ErrorFlag) { Write-Host "$Message" -ForegroundColor "Red" }
if($FatalFlag) { Write-Host "$Message" -ForegroundColor "Red" -BackgroundColor "Black" }
if($LinesAfter -ne 0) {
foreach($i in 1..$LinesAfter) { Write-Host "" }
}
if($ExitAfter) { SafeExit }
}
}
# VALIDATES STRING MIGHT BE A PATH #
function ValidatePath($PathName, $TestPath) {
If([string]::IsNullOrWhiteSpace($TestPath)) {
Print -x -f "$PathName is not a path"
}
}
# NORMALIZES RELATIVE OR ABSOLUTE PATH TO ABSOLUTE PATH #
function NormalizePath($PathName, $TestPath) {
ValidatePath "$PathName" "$TestPath"
$TestPath = [System.IO.Path]::Combine((pwd).Path, $TestPath)
$NormalizedPath = [System.IO.Path]::GetFullPath($TestPath)
return $NormalizedPath
}
# VALIDATES STRING MIGHT BE A PATH AND RETURNS ABSOLUTE PATH #
function ResolvePath($PathName, $TestPath) {
ValidatePath "$PathName" "$TestPath"
$ResolvedPath = NormalizePath $PathName $TestPath
return $ResolvedPath
}
# VALIDATES STRING RESOLVES TO A PATH AND RETURNS ABSOLUTE PATH #
function RequirePath($PathName, $TestPath, $PathType) {
ValidatePath $PathName $TestPath
If(!(Test-Path $TestPath -PathType $PathType)) {
Print -x -f "$PathName ($TestPath) does not exist as a $PathType"
}
$ResolvedPath = Resolve-Path $TestPath
return $ResolvedPath
}
# Like mkdir -p -> creates a directory recursively if it doesn't exist #
function MakeDirP {
[CmdletBinding()]
param (
[parameter(Mandatory=$TRUE,Position=0,HelpMessage="Path create.")]
[string]$Path
)
PROCESS {
New-Item -path $Path -itemtype Directory -force | Out-Null
}
}
# GETS ALL FILES IN A PATH RECURSIVELY #
function GetFiles {
[CmdletBinding()]
param (
[parameter(Mandatory=$TRUE,Position=0,HelpMessage="Path to get files for.")]
[string]$Path
)
PROCESS {
ls $Path -r | where { !$_.PSIsContainer }
}
}
# GETS ALL FILES WITH CALCULATED HASH PROPERTY RELATIVE TO A ROOT DIRECTORY RECURSIVELY #
# RETURNS LIST OF @{RelativePath, Hash, FullName}
function GetFilesWithHash {
[CmdletBinding()]
param (
[parameter(Mandatory=$TRUE,Position=0,HelpMessage="Path to get directories for.")]
[string]$Path,
[parameter(HelpMessage="The hash algorithm to use.")]
[string]$Algorithm="MD5"
)
PROCESS {
$OriginalPath = $PWD
SetWorkDir path/to/diff $Path
GetFiles $Path | select @{N="RelativePath";E={$_.FullName | Resolve-Path -Relative}},
@{N="Hash";E={(Get-FileHash $_.FullName -Algorithm $Algorithm | select Hash).Hash}},
FullName
SetWorkDir path/to/original $OriginalPath
}
}
# COMPARE TWO DIRECTORIES RECURSIVELY #
# RETURNS LIST OF @{RelativePath, Hash, FullName}
function DiffDirectories {
[CmdletBinding()]
param (
[parameter(Mandatory=$TRUE,Position=0,HelpMessage="Directory to compare left.")]
[alias("l")]
[string]$LeftPath,
[parameter(Mandatory=$TRUE,Position=1,HelpMessage="Directory to compare right.")]
[alias("r")]
[string]$RightPath
)
PROCESS {
$LeftHash = GetFilesWithHash $LeftPath
$RightHash = GetFilesWithHash $RightPath
diff -ReferenceObject $LeftHash -DifferenceObject $RightHash -Property RelativePath,Hash
}
}
### END FUNCTION DEFINITIONS ###
### PROGRAM LOGIC ###
if($Compare.length -ne 2) {
Print -x "Compare requires passing exactly 2 path parameters separated by comma, you passed $($Compare.length)." -f
}
Print "Comparing $($Compare[0]) to $($Compare[1])..." -a 1
$LeftPath = RequirePath path/to/left $Compare[0] container
$RightPath = RequirePath path/to/right $Compare[1] container
$Diff = DiffDirectories $LeftPath $RightPath
$LeftDiff = $Diff | where {$_.SideIndicator -eq "<="} | select RelativePath,Hash
$RightDiff = $Diff | where {$_.SideIndicator -eq "=>"} | select RelativePath,Hash
if($ExportSummary) {
$ExportSummary = ResolvePath path/to/summary/dir $ExportSummary
MakeDirP $ExportSummary
$SummaryPath = Join-Path $ExportSummary summary.txt
$LeftCsvPath = Join-Path $ExportSummary left.csv
$RightCsvPath = Join-Path $ExportSummary right.csv
$LeftMeasure = $LeftDiff | measure
$RightMeasure = $RightDiff | measure
"== DIFF SUMMARY ==" > $SummaryPath
"" >> $SummaryPath
"-- DIRECTORIES --" >> $SummaryPath
"`tLEFT -> $LeftPath" >> $SummaryPath
"`tRIGHT -> $RightPath" >> $SummaryPath
"" >> $SummaryPath
"-- DIFF COUNT --" >> $SummaryPath
"`tLEFT -> $($LeftMeasure.Count)" >> $SummaryPath
"`tRIGHT -> $($RightMeasure.Count)" >> $SummaryPath
"" >> $SummaryPath
$Diff | Format-Table >> $SummaryPath
$LeftDiff | Export-Csv $LeftCsvPath -f
$RightDiff | Export-Csv $RightCsvPath -f
}
$Diff
SafeExit
Further to @JNK's answer, you might want to ensure that you are always working with files rather than the less-intuitive output from Compare-Object. You just need to use the -PassThru switch...
$Folder1Path = "C:\Folder1"
$Folder2Path = "C:\Folder2"
$Folder3Path = "C:\Folder3"
$Folder1 = Get-ChildItem $Folder1Path
$Folder2 = Get-ChildItem $Folder2Path
# Get all differences, i.e. from both "sides"
$AllDiffs = Compare-Object $Folder1 $Folder2 -Property Name,Length -PassThru
# Filter for new/updated files from C:\Folder2
$Changes = $AllDiffs | Where-Object {$_.Directory.FullName -eq $Folder2Path}
# Copy to C:\Folder3
$Changes | Copy-Item -Destination $Folder3Path
This at least means you don't have to worry about which way the SideIndicator arrow points!
Also, bear in mind that you might want to compare on LastWriteTime as well.
Sub-folders
Looping through the sub-folders recursively is a little more complicated as you probably will need to strip off the respective root folder paths from the FullName field before comparing lists.
You could do this by adding a new ScriptProperty to your Folder1 and Folder2 lists:
$Folder1 | Add-Member -MemberType ScriptProperty -Name "RelativePath" `
-Value {$this.FullName -replace [Regex]::Escape("C:\Folder1"),""}
$Folder2 | Add-Member -MemberType ScriptProperty -Name "RelativePath" `
-Value {$this.FullName -replace [Regex]::Escape("C:\Folder2"),""}
You should then be able to use RelativePath as a property when comparing the two objects and also use that to join on to "C:\Folder3" when copying to keep the folder structure in place.
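Put together, the recursive variant might look something like the sketch below. It uses a calculated property instead of Add-Member, which is a swap on my part rather than the approach described above, and Get-RelativeFile and Copy-ChangedFile are invented names. It compares on RelativePath and Length; LastWriteTime can be added to -Property if needed.

```powershell
# Hypothetical helper: list files under $Root with a RelativePath property added.
function Get-RelativeFile {
    param([Parameter(Mandatory)][string]$Root)
    $rootPath = (Resolve-Path -LiteralPath $Root).ProviderPath.TrimEnd('\', '/')
    Get-ChildItem -LiteralPath $rootPath -Recurse -File |
        Select-Object FullName, Length, LastWriteTime,
            @{ Name = 'RelativePath'; Expression = { $_.FullName.Substring($rootPath.Length + 1) } }
}

# Hypothetical helper: copy files that are new or changed under $Updated into
# $Destination, recreating the sub-folder structure.
function Copy-ChangedFile {
    param(
        [Parameter(Mandatory)][string]$Baseline,
        [Parameter(Mandatory)][string]$Updated,
        [Parameter(Mandatory)][string]$Destination
    )
    $left  = @(Get-RelativeFile $Baseline)
    $right = @(Get-RelativeFile $Updated)
    if ($right.Count -eq 0) { return }

    # Compare-Object rejects empty input, so an empty baseline means "everything is new".
    $changed = if ($left.Count -eq 0) { $right } else {
        Compare-Object $left $right -Property RelativePath, Length -PassThru |
            Where-Object { $_.SideIndicator -eq '=>' }
    }

    foreach ($item in $changed) {
        $target = Join-Path $Destination $item.RelativePath
        New-Item -ItemType Directory -Path (Split-Path -Parent $target) -Force | Out-Null
        Copy-Item -LiteralPath $item.FullName -Destination $target -Force
    }
}

# Example call:
# Copy-ChangedFile -Baseline 'C:\Folder1' -Updated 'C:\Folder2' -Destination 'C:\Folder3'
```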
Here's an approach which will find files which are missing or differ in content.
First, a quick-and-dirty one-liner (see caveat below).
dir -r | rvpa -Relative |%{ if (Test-Path $right\$_) { if (Test-Path -Type Leaf $_) { if ( diff (cat $_) (cat $right\$_ ) ) { $_ } } } else { $_ } }
Run the above in one of the directories, with $right set to (or replaced with) the path to the other directory. Things missing from $right, or which differ in content, will be reported. No output means no differences found. CAVEAT: Things existing in $right but missing from the left will not be found/reported.
This doesn't bother calculating hashes; it just compares the file contents directly. Hashing makes sense when you want to reference something in another context (later date, on another machine, etc.), but when we're comparing things directly, it adds nothing but overhead. (It's also theoretically possible for two files to have the same hash, although that's basically impossible to happen by accident. Deliberate attack, on the other hand...)
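One thing to keep in mind with diff (cat ...) (cat ...): as far as I can tell, Compare-Object treats the two sets of lines as unordered collections by default, so two text files with the same lines in a different order come out as equal, and binary files are not really handled. If that matters, a byte-for-byte check is one alternative. A sketch, with Test-FileContentEqual being a name made up here:

```powershell
# Hypothetical helper: $true when both files contain exactly the same bytes.
function Test-FileContentEqual {
    param(
        [Parameter(Mandatory)][string]$Left,
        [Parameter(Mandatory)][string]$Right
    )
    $l = Get-Item -LiteralPath $Left
    $r = Get-Item -LiteralPath $Right

    # Files of different size can never match, so skip reading them.
    if ($l.Length -ne $r.Length) { return $false }

    # Reads both files fully into memory; fine for small files, not for huge ones.
    $lb = [System.IO.File]::ReadAllBytes($l.FullName)
    $rb = [System.IO.File]::ReadAllBytes($r.FullName)
    return [System.Linq.Enumerable]::SequenceEqual([byte[]]$lb, [byte[]]$rb)
}
```

In the one-liner, this would take the place of the diff (cat $_) (cat $right\$_) test, with the condition negated.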
Here's a more proper script, which handles more corner cases and errors.
[CmdletBinding()]
Param(
[Parameter(Mandatory=$true,Position=0)][string]$Left,
[Parameter(Mandatory=$True,Position=1)][string]$Right
)
# throw errors on undefined variables
Set-StrictMode -Version 1
# stop immediately on error
$ErrorActionPreference = [System.Management.Automation.ActionPreference]::Stop
# init counters
$Items = $MissingRight = $MissingLeft = $Contentdiff = 0
# make sure the given parameters are valid paths
$left = Resolve-Path $left
$right = Resolve-Path $right
# make sure the given parameters are directories
if (-Not (Test-Path -Type Container $left)) { throw "not a container: $left" }
if (-Not (Test-Path -Type Container $right)) { throw "not a container: $right" }
# Starting from $left as relative root, walk the tree and compare to $right.
Push-Location $left
try {
Get-ChildItem -Recurse | Resolve-Path -Relative | ForEach-Object {
$rel = $_
$Items++
# make sure counterpart exists on the other side
if (-not (Test-Path $right\$rel)) {
Write-Output "missing from right: $rel"
$MissingRight++
return
}
# compare contents for files (directories just have to exist)
if (Test-Path -Type Leaf $rel) {
if ( Compare-Object (Get-Content $left\$rel) (Get-Content $right\$rel) ) {
Write-Output "content differs : $rel"
$ContentDiff++
}
}
}
}
finally {
Pop-Location
}
# Check items in $right for counterparts in $left.
# Something missing from $left of course won't be found when walking $left.
# Don't need to check content again here.
Push-Location $right
try {
Get-ChildItem -Recurse | Resolve-Path -Relative | ForEach-Object {
$rel = $_
if (-not (Test-Path $left\$rel)) {
Write-Output "missing from left : $rel"
$MissingLeft++
return
}
}
}
finally {
Pop-Location
}
Write-Verbose "$Items items, $ContentDiff differed, $MissingLeft missing from left, $MissingRight from right"
Handy version using script parameter
Simple file-level comparison
Call it like PS > .\DirDiff.ps1 -a .\Old\ -b .\New\
Param(
[string]$a,
[string]$b
)
$fsa = Get-ChildItem -Recurse -path $a
$fsb = Get-ChildItem -Recurse -path $b
Compare-Object -Referenceobject $fsa -DifferenceObject $fsb
Possible output:
InputObject SideIndicator
----------- -------------
appsettings.Development.json <=
appsettings.Testing.json <=
Server.pdb =>
ServerClientLibrary.pdb =>
Do this:
compare (Get-ChildItem D:\MyFolder\NewFolder) (Get-ChildItem \\RemoteServer\MyFolder\NewFolder)
And even recursively:
compare (Get-ChildItem -r D:\MyFolder\NewFolder) (Get-ChildItem -r \\RemoteServer\MyFolder\NewFolder)
and is even hard to forget :)
gci -path 'C:\Folder' -recurse |where{$_.PSIsContainer}
-recurse will explore all subtrees below the root path given and the .PSIsContainer property is the one you want to test for to grab all folders only. You can use where{!$_.PSIsContainer} for just files.
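From PowerShell 3 onwards, Get-ChildItem also has -Directory and -File switches that do the same filtering without a where clause. A small self-contained sketch (the scratch folder and file names are made up for the example):

```powershell
# Scratch folder (hypothetical layout) so the example is self-contained.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path (Join-Path $root 'sub') -Force | Out-Null
Set-Content -Path (Join-Path $root 'a.txt') -Value 'a'

$folders = @(Get-ChildItem -Path $root -Recurse -Directory)   # like where { $_.PSIsContainer }
$files   = @(Get-ChildItem -Path $root -Recurse -File)        # like where { !$_.PSIsContainer }

$folders.Name   # sub
$files.Name     # a.txt

Remove-Item -LiteralPath $root -Recurse -Force
```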
