Part 12. Structured data processing with PowerShell

24 November 2023 9 minutes Author: Cyber Witcher

Data Automation with PowerShell: Essential Skills and Commands

Structured data processing in PowerShell is a key skill for IT and cybersecurity professionals. PowerShell, Microsoft’s powerful tool for automating tasks and scripting, allows you to efficiently manipulate and process data. In this article, we explore how to use PowerShell for structured data processing, including reading, parsing, transforming, and saving data. We cover key commands and techniques to help you work with a variety of data formats, including text, CSV, JSON, and more. This skill is extremely useful for efficient data management and automation of complex tasks.

In information systems, data with a certain structure is widely used (configuration files, transaction logs, data from APls of external systems, etc.). PowerShell has built-in tools for working with the most common formats of this data. In this article, you’ll find information about structured data processing with PowerShell. We take a detailed look at the various PowerShell methods and commands for working with data, including reading, parsing, transforming, and saving to various formats. This article will provide you with practical knowledge and skills that you can use to improve data processing efficiency in your day-to-day work.

Work with CSV data

In the simplest case, a structured file contains tabular data and is a set of lines consisting of separate fields separated from each other by a certain symbol. Often a comma is chosen as a separator, which is why such files are called (Comma Separated Values, Comma Separated Values).

The first line of the CSV file may list the field names, while the other lines contain the data itself. Example:

Name, Lastname, Age
Иван,Петров,22
Сергей,Андреев,34
Оль га,Белова,2 5
Мария,Никитина,41

Let’s look at how PowerShell can be used to read and modify data in CSV files.

Reading from a CSV file

To consider examples, let’s create a test.csv file in the C:\Script directory with the following questionnaire data (name, surname, and age):

Name, Lastname, Age
Иван,Петров,22
Сергей,Андреев,34
Оль га,Белова,2 5
Мария,Никитина,41

In principle, the content, like any other text file, can be read into an array of strings using the Get-Content cmdlet.

PS C:\Script> $names = Get-Content .\test.csv
PS C:\Script> Get-Member -Inputobject $names
TypeName: System.Object[]
PS C:\Script> $names | Get-Member
TypeName: System.String
. z. .
PS C:\Script> $names
Name,Lastname,Age
Иван,Петров,22
Сергей,Андреев, 34
Оль га,Белова,2 5
Мария,Никитина, 41

However, the Get-Content cmdlet knows nothing about the internal structure of strings, so you have to parse the content to extract its constituent parts from the strings.

Therefore, to analyze structured data from it, it is better to use a special cmdlet Import-Csv, which immediately converts the data from the file into an array of objects with the required fields.

Using Import-Csv, we read the data from the test.csv file into the $names variable and display its contents:

PS C:\Script> $names = Import-Csv -Path .\test.csv
PS C:\Script> $names
Name Lastname Age
Иван Петров 22
Сергей Андреев 34
Ольга Белова 25
Мария Никитина 41

As you can see, the Import-Csv cmdlet understood that the first line of the file contains a header that defines the structure of each line. As with Get-Content, the $names variable is an array of objects:

PS C:\Script> Get-Member -Inputobject $names
TypeName: System.Object[]
Name MemberType Definition
Count AliasProperty Count = Length
Add
Address
Method
Method
int IList.Add(System...
System.Objects, msco...

However, now each element of the $names array is not a simple string, but an object of type PSCustomObject that has Name, LastName, and Ade fields:

PS C:\Script> $nanes | Get-Member
TypeName: System.Management.Automation.PSCustomObject
Name MemberType Definition
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
Age NoteProperty string Age=22
Lastname NoteProperty string Lastname=neTpoB
Name NoteProperty string Name=HBaH

Using this array, we can easily search, sort, or group data using the cmdlets discussed in Chapter 5. For example, let’s sort the records in our table by age:

PS C:\Script> $names | Sort-Object Age
Name Lastname Age
Иван Петров 22
Ольга Белова 25
Сергей Андреев 34
Мария Никитина 41

Field names during import can be changed using -Header:

PS C:\Script> Import-Csv -Path .\test.csv -Header 'Имя', 'Фамилия', 'Возраст'
Имя Фамилия Возраст
Name Lastname Age
Иван Петров 22
Сергей Андреев 34
Ольга Белова 25
Мария Никитина 41

By default, import-csv considers a comma as the delimiter. If a different character is used as a delimiter in the file, it must be specified in the -Delimiter parameter. For example, let’s replace commas in the test.csv file with a semicolon symbol:

PS C:\Script> (Get-Content .\test.csv -Raw).Replace(',', ';') | Set-Content
.\test.csv
PS C:\Script> Get-Content .\test.csv
Name;Lastname;Age
Иван;Петров;22
Сергей;Андреев;3 4
Оль га;Белова;2 5
Мария;Никитина;41
Імпортувати такий файл можна наступною командою:
PS C:\Script> Import-Csv -Path .\test.csv -Delimiter
Name Lastname Age
Иван Петров 22
Сергей Андреев 34
Ольга Белова 25
Мария Никитина 41

Запис у файл CSV

You can use the Export-Csv cmdlet to generate a CSV file from an array of PowerShell objects. For example, let’s get a list of all services running on the local computer (status, name and display name):

Let’s save this list in the $services variable:

PS C:\Script> $services = Get-Service | Where-Object Status -eq Running |
Select-Object Status, Name, DisplayName
Експортуємо масив об'єктів $services до CSV-файлу services.csv:
PS C:\Script> $services | Export-Csv -Path .\services.csv

Let’s check the contents of the services.csv file:

PS C:\Script> Get-Content .\services.csv
#TYPE Selected.System.ServiceProcess.Servicecontroller
"Status","Name","DisplayName"
"Running","AdobeARMservice","Adobe Acrobat Update Service"
"Running","AMD External Events Utility","AMD External Events Utility"
"Running","AMD FUEL Service","AMD FUEL Service"

As you can see, the file was created correctly, but its first line contains the type of objects from which it was obtained (in our case, these are objects of type Se1ected). System. ServiceProcess . ServiceContr011er). If you do not need this information in the file, you should specify the -NoTypeInformation option when calling Export-Csv:

PS C:\Script> $services | Export-Csv -Path .\services.csv -NoTypeinformation

In this case, the first line in the file will contain the field names:

PS C:\Script> Get-Content .\services.csv
"Status","Name","DisplayName"
"Running","AdobeARMservice","Adobe Acrobat Update Service"
"Running","AMD External Events Utility","AMD External Events Utility"
"Running","AMD FUEL Service","AMD FUEL Service"
"Running","Appinfo","Application Information"

Data processing without access to the file

PowerShell has the ConvertFrom-Csv and ConvertTo-Csv cmdlets, which work similarly to ImportFrom-Csv and ExportTo-Csv, but do not access external files.

For example, let’s repeat our example with a list of running services:

PS C:\Script> $services = Get-Service | Where-Object Status -eq Running |
Select-Object Status, Name, DisplayName

We convert the array of $services objects to CSV:

PS C:\Script> $csv = $services | ConvertTo-Csv

The $csv variable contains a string in CSV format:

PS C:\Script> $csv | Get-Member
TypeName: System. String
PS C:\Script> $csv
#TYPE Selected.System.ServiceProcess.Servicecontroller
"Status", "Name","DisplayName"
"Running","AdobeARMservice","Adobe Acrobat Update Service"
"Running","AMD External Events Utility","AMD External Events Utility"
"Running","AMD FUEL Service","AMD FUEL Service"

Now convert the $csv string into the $objs variable:

PS C:\Script> $objs = $csv | ConvertFrom-Csv

The Sobjs array will contain objects of type Selected. System. Serviceprocess. The Servicecontroller that was specified in the #type comment:

PS C:\Script> Get-Mamber -Inputobject $objs
TypeName: System.Object[]
PS C:\Script> $objs | Get-Member
TypeName: CSV:Selected.System.ServiceProcess.Servicecontroller
PS C:\Script> $objs
Status Name DisplayName
Running AdobeARMservice Adobe Acrobat Update Service
Running AMD External Events Utility AMD External EventsUtility
Running AMD FUEL Service AMD FUEL Service

Data processing in JSON format

Data in JSON (JavaScript Object Notation) format is widely used in web development, as well as for storing application settings. You CAN work with JSON in PowerShell USING the ConvertFrom-Json and ConvertTo-Json cmdlets. Note Unlike CSV, there are no special cmdlets for importing/exporting JSON files in PowerShell.

Let’s convert to JSON the contents of our CSV file C:\Script\test.csv. To do this, we first import this file into an array of PowerShell objects, and then convert this array to JSON by passing it through the pipeline to the convertToJson cmdlet:

PS C:\Script> Import-Csv -Path .\test.csv
Name Lastname Age
Иван Петров 22
Сергей Андреев 34
Ольга Белова 25
Мария Никитина 41
PS C:\Script> Import-Csv -Path .\test.csv | ConvertTo-Json
[
{
"Name": "Иван",
"Lastname": "Петров",
"Age": "22"
},
"Name": "Сергей",
"Lastname": "Андреев",
"Age": "34"
{
"Name": "Ольга",
"Lastname": "Белова",
"Age": "25"
},
(
"Name": "Мария",
"Lastname": "Никитина",
"Age": "41"
}
]

We make sure that the result is a line:

PS C:\Script> $json = Import-Csv -Path .\test.csv | ConvertTo-Json
PS C:\Script> Get-Member -Inputobject $json
TypeName: System.String

By default, JSON-cTpoka is generated in a structured form that is convenient for human understanding: nested elements are indented, fields are separated by line breaks. If the JSON is processed automatically, you can use the -Compress option to remove unnecessary spaces and line breaks for formatting. In this case, fields and values are written sequentially, and JSON-cTpoka will have a minimum length:

PS C:\Script> Import-Csv -Path .\test.csv | ConvertTo-Json -Compress
[{"Name":"Иван","Lastname":"Петров","Age":"22"},{"Name":"Сергей","Lastname":
"Андреев","Age":"34"},{"Name":"Ольга","Lastname":"Белова","Age":"25"},{"Name":"
Мария","Lastname":"Никитина","Age":"41"}]

To create a file in JSON format, the string obtained as a result of conversion must be saved using the set-content or Out-File cmdlets:

PS C:\Script> Inport-Csv -Path .\test.csv | ConvertTo-Json | Set-Content
.\test.json

Note that unlike exporting to a CSV file, when converting an array of objects to JSON, information about the initial type of objects is lost. To illustrate this, let’s create the $services array again, which will store the list of services running on the local computer:

PS C:\Script> $services = Get-Service | Where-Object Status -eq Running |
Select-Object Status, Name, DisplayName
Елементами цього масиву будуть об'єкти типу Selected.System.Serviceprocess.
Servicecontroller:
PS C:\Script> $services | Get-Member
TypeName: Selected.System.ServiceProcess.Servicecontroller
Перетворимо масив $services в JSON и сохраним результат в переменной $json_
services:
PS C:\Script> $json_services = $services | ConvertTo-Json

Now let’s perform the reverse conversion – convert the $json_services string into the $objs array USING the ConvertFrom-Json cmdlet:

PS C:\Script> $objs = $json_services | ConvertFrom-Json

Let’s check what type the elements of the $objs array have:

PS С:\Users\andrv> $objs | Get-Member
TypeName: System.Management.Automation.PSCustomObject
Name MemberType Definition
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
DisplayName NoteProperty string DisplayName=Adobe Acrob
Name NoteProperty string Name=AdobeARMservice
Status NoteProperty int Status=4

As you can see, objects of the type System.Management.Automation.PSCustomObject are created as a result of conversion from the JSON string $json_services. These objects have the same fields as the original Selected objects. System. Serviceprocess.servicecontroller, however, specific methods for controlling services in JSON-converted objects will not be available.

Results

  • ConvertFrom-*, ConvertTo-* cmdlets are used to work with the most common PowerShell structured data formats.

  • CSV files can be read and modified using the import-csv and export-csv cmdlets.

Other related articles
Found an error?
If you find an error, take a screenshot and send it to the bot.