performance – Python script that converts Windows Registry Scripts (.reg) into PowerShell scripts (.ps1)

This is a re-implementation of a PowerShell script I wrote that does exactly the same thing; I have ported it to Python.

After a quick Google search I found only one other script that does the same thing, which can be found here: https://reg2ps.azurewebsites.net. However, the output of its "Get remediation script" isn't as beautiful as mine, so I think my script does something truly special and pioneering.

You can find the PowerShell version here: https://codereview.stackexchange.com/a/261267/234107

This Python script converts a Windows registry file into a PowerShell script that is readily executable. It converts the contents of the .reg file into New-PSDrive (if the script modifies a hive that isn't HKCU or HKLM), New-Item, Set-ItemProperty, Remove-Item and Remove-ItemProperty commands.

It supports all five default registry hives: HKEY_CLASSES_ROOT, HKEY_CURRENT_CONFIG, HKEY_CURRENT_USER, HKEY_LOCAL_MACHINE and HKEY_USERS, and conversion from six registry data types: REG_SZ, REG_DWORD, REG_QWORD, REG_EXPAND_SZ, REG_MULTI_SZ and REG_BINARY.
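To make the key handling concrete, here is a minimal sketch of how a key line could be turned into a PowerShell path. The names HIVES and key_to_ps_path are made up for this illustration and are not taken from the script in this post; the sketch assumes the line is a well-formed key line.

```python
# Hive-to-drive mapping as listed above (illustrative names only).
HIVES = {
    'HKEY_CLASSES_ROOT': 'HKCR:',
    'HKEY_CURRENT_CONFIG': 'HKCC:',
    'HKEY_CURRENT_USER': 'HKCU:',
    'HKEY_LOCAL_MACHINE': 'HKLM:',
    'HKEY_USERS': 'HKU:',
}


def key_to_ps_path(key_line):
    # Assumes a well-formed key line such as [HKEY_CURRENT_USER\Test].
    key = key_line.strip()[1:-1]      # drop the surrounding [ and ]
    if key.startswith('-'):           # [-KEY] marks a key to delete
        key = key[1:]
    hive, sep, rest = key.partition('\\')
    return HIVES[hive] + sep + rest


print(key_to_ps_path(r'[HKEY_CURRENT_USER\Test]'))  # HKCU:\Test
```

Only the hive name at the start of the path is swapped; the rest of the key path is kept as it is.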

For starters, REG_SZ and REG_DWORD are written in plain text. REG_SZ values are plain text strings, and their datatype in the Set-ItemProperty cmdlet is String. REG_DWORD values are 32-bit (4 bytes, or two words) binary values written as 8 hexadecimal digits; their datatype is DWord and their values must be preceded by the hexadecimal prefix 0x.

String values are indicated by a double quote after the assignment sign; dword values are indicated by =dword:.
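As a rough sketch of the dword rule, assuming the line has already been recognised as a dword value (the helper name dword_to_ps is hypothetical and not part of my script):

```python
def dword_to_ps(line):
    # Split '"Dword0"=dword:b5e50577' into the quoted name and the hex digits.
    name, _, hexdigits = line.partition('=dword:')
    value = int(hexdigits, 16)            # raises ValueError on bad hex
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError('not a 32-bit value')
    # PowerShell needs the 0x prefix to read the digits as hexadecimal.
    return name, '0x' + hexdigits.lower()


print(dword_to_ps('"Dword0"=dword:b5e50577'))
```

The range check is only there to catch malformed input; a valid .reg file never has more than 8 digits after dword:.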

REG_QWORD is a 64-bit (8 bytes, or four words) binary value, equivalent to 16 hexadecimal digits. In a .reg file it is split into two-digit bytes, written in reversed (little-endian) order and joined by commas.

Qword values are indicated by =hex(b):. For example, "Qword0"=hex(b):8d,02,4e,b8,00,00,00,00 means the value is the qword b84e028d. Their datatype is QWord and their values must be preceded by 0x.
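The byte reversal can be sketched with int.from_bytes, which reads the bytes in little-endian order directly. This is only an illustration under the assumption that the input is the comma-separated text after hex(b):; the function name qword_to_ps is made up.

```python
def qword_to_ps(hex_bytes):
    # '8d,02,4e,b8,00,00,00,00' -> b'\x8d\x02\x4e\xb8\x00\x00\x00\x00'
    raw = bytes.fromhex(hex_bytes.replace(',', ''))
    # The least significant byte comes first, so read it as little-endian.
    return hex(int.from_bytes(raw, 'little'))


print(qword_to_ps('8d,02,4e,b8,00,00,00,00'))  # 0xb84e028d
```

Going through an integer also handles a value of zero, which comes out as 0x0 rather than a bare 0x.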

REG_EXPAND_SZ is ExpandString, indicated by =hex(2):. In my example it is a string of multiple substrings delimited by semicolons. The string is encoded as UTF-16LE, which for ASCII characters means a 00 (null) byte follows every character byte, and it ends with a null terminator (00,00). The bytes are delimited by commas, and the whole encoding is broken up into multiple lines, with a backslash at the end of every line that continues onto the next.

Like this:

"ExpandString"=hex(2):53,00,74,00,72,00,69,00,6e,00,67,00,31,00,3b,00,53,00,74,\
  00,72,00,69,00,6e,00,67,00,32,00,3b,00,53,00,74,00,72,00,69,00,6e,00,67,00,\
  33,00,3b,00,53,00,74,00,72,00,69,00,6e,00,67,00,34,00,00,00

Notice that every second byte is a null byte.
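A small sketch of decoding such a value, assuming the bytes are UTF-16LE text with a trailing null terminator. The function name decode_expand_string is hypothetical, and the continuation lines are joined by hand here just for the demonstration.

```python
def decode_expand_string(hex_bytes):
    # Turn the comma-separated hex text into raw bytes.
    raw = bytes.fromhex(hex_bytes.replace(',', ''))
    # Decode as UTF-16LE and drop the trailing null terminator.
    return raw.decode('utf-16-le').rstrip('\x00')


lines = [
    '53,00,74,00,72,00,69,00,6e,00,67,00,31,00,3b,00,53,00,74,\\',
    '  00,72,00,69,00,6e,00,67,00,32,00,3b,00,53,00,74,00,72,00,69,00,6e,00,67,00,\\',
    '  33,00,3b,00,53,00,74,00,72,00,69,00,6e,00,67,00,34,00,00,00',
]
# Strip the indentation and the trailing backslash, then glue the lines together.
joined = ''.join(line.strip().rstrip('\\') for line in lines)
print(decode_expand_string(joined))  # String1;String2;String3;String4
```

Decoding the whole byte string at once, instead of skipping every second byte, would also keep non-ASCII characters intact.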

REG_MULTI_SZ is MultiString, indicated by =hex(7):, and is very similar to ExpandString. It is a list of strings (lines) with null characters serving as line breaks, so in addition to the padding null bytes there is a null character (00,00) after each line and one more at the very end, like this:

"MultiString0"=hex(7):4c,00,69,00,6e,00,65,00,20,00,31,00,00,00,4c,00,69,00,6e,\
  00,65,00,20,00,32,00,00,00,4c,00,69,00,6e,00,65,00,20,00,33,00,00,00,4c,00,\
  69,00,6e,00,65,00,20,00,34,00,00,00,4c,00,69,00,6e,00,65,00,20,00,35,00,00,\
  00,00,00

These separator nulls must be represented by commas in PowerShell: the correct way to set a MultiString value is to supply an array of the strings, with the commas where the separator nulls are.
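The same idea as a sketch, again assuming UTF-16LE bytes. Both function names are invented for this example, and quoting of special characters such as " or $ inside the strings is deliberately not handled.

```python
def decode_multi_string(hex_bytes):
    raw = bytes.fromhex(hex_bytes.replace(',', ''))
    # Drop the terminating nulls, then split on the separator nulls.
    text = raw.decode('utf-16-le').rstrip('\x00')
    return text.split('\x00') if text else []


def to_ps_array(strings):
    # Build a PowerShell array literal such as @("Line 1","Line 2").
    return '@(' + ','.join('"' + s + '"' for s in strings) + ')'


# Build the hex text for two lines instead of typing it out by hand.
sample = ','.join(f'{b:02x}' for b in 'Line 1\0Line 2\0\0'.encode('utf-16-le'))
print(to_ps_array(decode_multi_string(sample)))  # @("Line 1","Line 2")
```

Splitting on the null character rather than on a substituted comma means lines that themselves contain commas stay in one piece.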

REG_BINARY is an arbitrary binary value in any format, indicated by =hex:. It is written as comma-separated hexadecimal bytes, with the same line continuation as ExpandString and MultiString.

Like this:

"Test Binary"=hex:74,68,69,73,20,69,73,20,61,20,74,65,73,74,20,73,74,72,69,6e,\
  67
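For binary values the bytes are passed through unchanged; only the notation differs. A minimal sketch, with the hypothetical helper name binary_to_ps, assuming the continuation lines have already been joined:

```python
def binary_to_ps(hex_bytes):
    raw = bytes.fromhex(hex_bytes.replace(',', ''))
    # Each byte becomes a 0x-prefixed literal inside a [byte[]] cast.
    return '([byte[]]$(' + ','.join(f'0x{b:02x}' for b in raw) + '))'


print(binary_to_ps('74,68,69'))  # ([byte[]]$(0x74,0x68,0x69))
```

Round-tripping through bytes.fromhex also rejects anything that is not valid hexadecimal before it reaches the generated script.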

Those are all the principles of the value conversions.

So here is the code:

import os
import re
import sys


def reg2ps1(args):
    # Map each registry hive name to the PowerShell drive used in the output.
    hive = {
        'HKEY_CLASSES_ROOT':   'HKCR:',
        'HKEY_CURRENT_CONFIG': 'HKCC:',
        'HKEY_CURRENT_USER':   'HKCU:',
        'HKEY_LOCAL_MACHINE':  'HKLM:',
        'HKEY_USERS':          'HKU:'
    }

    addedpath = []

    if os.path.isfile(args) and args.endswith('.reg'):
        commands = []
        with open(args, 'r', encoding='utf-16') as f:
            content = f.read()
        # Only HKCU: and HKLM: exist by default, so the other hives need a PSDrive.
        for r in hive.keys():
            if r in content and hive[r] not in ('HKCU:', 'HKLM:'):
                commands.append("New-PSDrive -Name {0} -PSProvider Registry -Root {1}".format(hive[r].replace(':', ''), r))
        filecontent = []
        for line in content.splitlines():
            if line != '':
                filecontent.append(line.strip())

        text = ''
        joinedlines = []

        # Join lines that end with a backslash (line continuation).
        for line in filecontent:
            if line.endswith('\\'):
                text = text + line[:-1]
            else:
                joinedlines.append(text + line)
                text = ''

        for joinedline in joinedlines:
            if re.search(r'^\[-?HKEY.*\]$', joinedline):
                # Key line: [HKEY_...\Sub\Key] creates, [-HKEY_...\Sub\Key] deletes.
                key = re.sub(r'^\[-?|\]$', '', joinedline)
                hivename = key.split('\\')[0]
                key = '"' + key.replace(hivename, hive[hivename], 1) + '"'
                if joinedline.startswith('[-HKEY'):
                    commands.append(f'Remove-Item -Path {key} -Force -Recurse -ErrorAction SilentlyContinue')
                else:
                    if key not in addedpath:
                        commands.append(f'New-Item -Path {key} -ErrorAction SilentlyContinue | Out-Null')
                        addedpath.append(key)
            elif joinedline.startswith('@="'):
                # Default value of the key; only string defaults are handled.
                value = joinedline.replace('@=', '', 1)
                commands.append(f'Set-ItemProperty -Path {key} -Name "(Default)" -Type "String" -Value {value}')
            elif re.search(r'^"([^"=]+)"=', joinedline):
                delete = False
                name = re.search(r'^("[^"=]+")=', joinedline).groups()[0]
                if joinedline.endswith('"=-'):
                    # "Name"=- deletes the value.
                    commands.append(f'Remove-ItemProperty -Path {key} -Name {name} -Force')
                    delete = True
                elif '"="' in joinedline:
                    vtype = 'String'
                    value = re.sub(r'^"([^"=]+)"=', '', joinedline)
                elif '"=dword:' in joinedline:
                    vtype = 'Dword'
                    value = '0x' + re.sub(r'^"([^"=]+)"=dword:', '', joinedline)
                elif '"=qword:' in joinedline:
                    vtype = 'QWord'
                    value = '0x' + re.sub(r'^"([^"=]+)"=qword:', '', joinedline)
                elif re.search(r'hex(\([27b]\))?:', joinedline):
                    value = re.sub(r'^"([^"=]+)"=hex(\([27b]\))?:', '', joinedline).split(',')
                    hextype = re.search(r'(hex(\([27b]\))?)', joinedline).groups()[0]
                    if hextype == 'hex(2)':
                        vtype = 'ExpandString'
                        chars = []
                        # Take every other byte, skipping nulls (ASCII text only).
                        for i in range(0, len(value), 2):
                            if value[i] != '00':
                                chars.append(bytes.fromhex(value[i]).decode('utf-8'))
                        value = '"' + ''.join(chars) + '"'
                    elif hextype == 'hex(7)':
                        vtype = 'MultiString'
                        chars = []
                        # A null at an even index separates two strings.
                        for i in range(0, len(value), 2):
                            if value[i] != '00':
                                chars.append(bytes.fromhex(value[i]).decode('utf-8'))
                            else:
                                chars.append(',')
                        chars0 = (''.join(chars)).split(',')
                        chars.clear()
                        for i in chars0:
                            chars.append('"' + i + '"')
                        # Drop the two empty strings left by the terminating nulls.
                        value = '@(' + ','.join(chars).replace(',"",""', '') + ')'
                    elif hextype == 'hex(b)':
                        vtype = 'QWord'
                        # Bytes are little-endian, so reverse them first.
                        value.reverse()
                        value = '0x' + (''.join(value).lstrip('0') or '0')
                    elif hextype == 'hex':
                        vtype = 'Binary'
                        value1 = []
                        for i in value:
                            value1.append('0x' + i)
                        value = '([byte[]]$(' + ','.join(value1) + '))'
                else:
                    # Unsupported value type: skip the line.
                    continue
                if not delete:
                    commands.append('Set-ItemProperty -Path {0} -Name {1} -Type {2} -Value {3} -Force'.format(key, name, vtype, value))
        filename = args.replace('.reg', '_reg.ps1')
        with open(filename, 'w') as output:
            print(*commands, sep='\n', file=output)


if __name__ == '__main__':
    reg2ps1(sys.argv[1])

I am very new to Python and this is the first time I have written something as complex as this in Python. I know my script is really ugly, but it does get the conversions right.

Sample input:

Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Test]
"Test String"="This is a test string"
"Test Binary"=hex:74,68,69,73,20,69,73,20,61,20,74,65,73,74,20,73,74,72,69,6e,\
  67
"Dword0"=dword:b5e50577
"Dword1"=dword:b7feec6c
"Qword0"=hex(b):8d,02,4e,b8,00,00,00,00
"Qword2"=hex(b):ff,ff,ff,ff,00,00,00,00
"MultiString0"=hex(7):4c,00,69,00,6e,00,65,00,20,00,31,00,00,00,4c,00,69,00,6e,\
  00,65,00,20,00,32,00,00,00,4c,00,69,00,6e,00,65,00,20,00,33,00,00,00,4c,00,\
  69,00,6e,00,65,00,20,00,34,00,00,00,4c,00,69,00,6e,00,65,00,20,00,35,00,00,\
  00,00,00
"ExpandString"=hex(2):53,00,74,00,72,00,69,00,6e,00,67,00,31,00,3b,00,53,00,74,\
  00,72,00,69,00,6e,00,67,00,32,00,3b,00,53,00,74,00,72,00,69,00,6e,00,67,00,\
  33,00,3b,00,53,00,74,00,72,00,69,00,6e,00,67,00,34,00,00,00

Sample output:

Registry Editor view: (screenshot)

New-Item -Path "HKCU:\Test" -ErrorAction SilentlyContinue | Out-Null
Set-ItemProperty -Path "HKCU:\Test" -Name "Test String" -Type String -Value "This is a test string" -Force
Set-ItemProperty -Path "HKCU:\Test" -Name "Test Binary" -Type Binary -Value ([byte[]]$(0x74,0x68,0x69,0x73,0x20,0x69,0x73,0x20,0x61,0x20,0x74,0x65,0x73,0x74,0x20,0x73,0x74,0x72,0x69,0x6e,0x67)) -Force
Set-ItemProperty -Path "HKCU:\Test" -Name "Dword0" -Type Dword -Value 0xb5e50577 -Force
Set-ItemProperty -Path "HKCU:\Test" -Name "Dword1" -Type Dword -Value 0xb7feec6c -Force
Set-ItemProperty -Path "HKCU:\Test" -Name "Qword0" -Type QWord -Value 0xb84e028d -Force
Set-ItemProperty -Path "HKCU:\Test" -Name "Qword2" -Type QWord -Value 0xffffffff -Force
Set-ItemProperty -Path "HKCU:\Test" -Name "MultiString0" -Type MultiString -Value @("Line 1","Line 2","Line 3","Line 4","Line 5") -Force
Set-ItemProperty -Path "HKCU:\Test" -Name "ExpandString" -Type ExpandString -Value "String1;String2;String3;String4" -Force

Please help me simplify and beautify my code, so that it does the same conversions correctly with less code and better formatting. Thank you!