Quantcast
Channel: PowerShell.com – PowerShell Scripts, Tips, Forums, and Resources
Viewing all articles
Browse latest Browse all 6937

SQL bulk copy from csv file - Encoding

$
0
0

Hi Experts

This is the first time I am creating a PowerShell script and it is almost working. I just have some problems with the actual bulk import to SQL encoding from the text file since it replaces special characters with a question mark. I have set the encoding when creating the csv file but that does not seem to reflect on the actual bulk import. I have tried difference scenarios with the encoding part but I cannot find the proper solution for that.

To shortly outline what the script does:

  • Connect to Active Directory fetching all user - but excluding users in specific OU's
  • Export all users to a csv in unicode encoding
  • Strip double quote text identifiers (if there is another way of handling that it will be much appreciated)
  • Clear all records temporary SQL table
  • Import records from csv file to temporary SQL table (this is where the encoding is wrong)
  • Update existing records in another table based on the records in the temporary table and insert new record if not found.

The script looks as the following (any suggestions for optimizing the script are very welcome):

# CSV file variables
$path = Split-Path -parent "C:\Temp\ExportADUsers\*.*"
$filename = "AD_Users.csv"
$csvfile = $path + "\" + $filename
$csvdelimiter = ";"
$firstRowColumns = $true

# Active Directory variables
$searchbase = "OU=Users,DC=fabrikam,DC=com"
$ADServer = 'DC01'

# Database variables
$sqlserver = "DB02"
$database = "My Database"
$table = "tblADimport"
$tableEmployee = "tblEmployees"

# Initialize
Write-Host "Script started..."
$elapsed = [System.Diagnostics.Stopwatch]::StartNew()

# GET DATA FROM ACTIVE DIRECTORY

# Import the ActiveDirectory Module
Import-Module ActiveDirectory

# Get all AD users not in specified OU's
Write-Host "Retrieving users from Active Directory..."
$AllADUsers = Get-ADUser -server $ADServer `

-searchbase $searchbase -Filter * -Properties * |
?{$_.DistinguishedName -notmatch 'OU=MeetingRooms,OU=Users,DC=fabrikam,DC=com' `

    -and $_.DistinguishedName -notmatch 'OU=FunctionalMailbox,OU=Users,DC=fabrikam,DC=com'}

Write-Host "Users retrieved in $($elapsed.Elapsed.ToString())."

 

# Define labels and get specific user fields

Write-Host "Generating CSV file..."

$AllADUsers |
Select-Object @{Label = "UNID";Expression = {$_.objectGuid}},
@{Label = "FirstName";Expression = {$_.GivenName}},
@{Label = "LastName";Expression = {$_.sn}},
@{Label = "EmployeeNo";Expression = {$_.EmployeeID}} |

 

# Export CSV file and remove text qualifiers
Export-Csv -NoTypeInformation $csvfile -Encoding Unicode -Delimiter $csvdelimiter

Write-Host "Removing text qualifiers..."

(Get-Content $csvfile) | foreach {$_ -replace '"'} | Set-Content $csvfile

Write-Host "CSV file created in $($elapsed.Elapsed.ToString())."

 

# DATABASE IMPORT
[void][Reflection.Assembly]::LoadWithPartialName("System.Data")
[void][Reflection.Assembly]::LoadWithPartialName("System.Data.SqlClient")

$batchsize = 50000

 

# Delete all records in AD import table

Write-Host "Clearing records in AD import table..."

Invoke-Sqlcmd -Query "DELETE FROM $table" -Database $database -ServerInstance $sqlserver

 

# Build the sqlbulkcopy connection, and set the timeout to infinite
$connectionstring = "Data Source=$sqlserver;Integrated Security=true;Initial Catalog=$database;"
$bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connectionstring, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock)
$bulkcopy.DestinationTableName = $table
$bulkcopy.bulkcopyTimeout = 0
$bulkcopy.batchsize = $batchsize

 

# Create the datatable and autogenerate the columns
$datatable = New-Object System.Data.DataTable

 

# Open the text file from disk
$reader = New-Object System.IO.StreamReader($csvfile)
$columns = (Get-Content $csvfile -First 1).Split($csvdelimiter)
if ($firstRowColumns -eq $true) { $null = $reader.readLine()}

Write-Host "Importing to database..."

foreach ($column in $columns) { 
    $null = $datatable.Columns.Add()
}
 
# Read in the data, line by line
while (($line = $reader.ReadLine()) -ne $null)  {
   
    $null = $datatable.Rows.Add($line.Split($csvdelimiter))
 
    $i++; if (($i % $batchsize) -eq 0) { 
        $bulkcopy.WriteToServer($datatable) 
        Write-Host "$i rows have been inserted in $($elapsed.Elapsed.ToString())."
        $datatable.Clear() 
    } 

 
# Add in all the remaining rows since the last clear
if($datatable.Rows.Count -gt 0) {
    $bulkcopy.WriteToServer($datatable)
    $datatable.Clear()
}
 
# Clean Up
Write-Host "CSV file imported in $($elapsed.Elapsed.ToString())."
$reader.Close(); $reader.Dispose()
$bulkcopy.Close(); $bulkcopy.Dispose()
$datatable.Dispose()

 

# Sometimes the Garbage Collector takes too long to clear the huge datatable.
[System.GC]::Collect()

# Update tblEmployee with imported data
Write-Host "Updating employee data..."

$queryUpdateUsers = "UPDATE $($tableEmployee)
                     SET $($tableEmployee).EmployeeNumber = $($table).EmployeeNo,
                      $($tableEmployee).FirstName = $($table).FirstName,
                      $($tableEmployee).LastName = $($table).LastName, 
                     FROM $($tableEmployee) INNER JOIN $($table) ON $($tableEmployee).UniqueNumber = $($table).UNID

                     IF @@ROWCOUNT=0

                        INSERT INTO $($tableEmployee) (EmployeeNumber, FirstName, LastName,  UniqueNumber)

                            SELECT EmployeeNo, FirstName, LastName, UNID

                            FROM $($table)"

 

try{

    Invoke-Sqlcmd -ServerInstance $sqlserver -Database $database -Query $queryUpdateUsers

    Write-Host "Table $($tableEmployee) updated in $($elapsed.Elapsed.ToString())."

}

catch{

    Write-Host "An error occured when updating $($tableEmployee) $($elapsed.Elapsed.ToString())."

}

 

Write-Host "Script completed in $($elapsed.Elapsed.ToString())."


Viewing all articles
Browse latest Browse all 6937

Trending Articles