Structured data processing in PowerShell is a key skill for IT and cybersecurity professionals. PowerShell, Microsoft’s powerful tool for automating tasks and scripting, allows you to efficiently manipulate and process data. In this article, we explore how to use PowerShell for structured data processing, including reading, parsing, transforming, and saving data. We cover key commands and techniques to help you work with a variety of data formats, including text, CSV, JSON, and more. This skill is extremely useful for efficient data management and automation of complex tasks.
In information systems, data with a certain structure is widely used (configuration files, transaction logs, data from APls of external systems, etc.). PowerShell has built-in tools for working with the most common formats of this data. In this article, you’ll find information about structured data processing with PowerShell. We take a detailed look at the various PowerShell methods and commands for working with data, including reading, parsing, transforming, and saving to various formats. This article will provide you with practical knowledge and skills that you can use to improve data processing efficiency in your day-to-day work.
In the simplest case, a structured file contains tabular data and is a set of lines consisting of separate fields separated from each other by a certain symbol. Often a comma is chosen as a separator, which is why such files are called (Comma Separated Values, Comma Separated Values).
The first line of the CSV file may list the field names, while the other lines contain the data itself. Example:
Name, Lastname, Age Иван,Петров,22 Сергей,Андреев,34 Оль га,Белова,2 5 Мария,Никитина,41
Let’s look at how PowerShell can be used to read and modify data in CSV files.
To consider examples, let’s create a test.csv file in the C:\Script directory with the following questionnaire data (name, surname, and age):
Name, Lastname, Age Иван,Петров,22 Сергей,Андреев,34 Оль га,Белова,2 5 Мария,Никитина,41
In principle, the content, like any other text file, can be read into an array of strings using the Get-Content cmdlet.
PS C:\Script> $names = Get-Content .\test.csv PS C:\Script> Get-Member -Inputobject $names TypeName: System.Object[] PS C:\Script> $names | Get-Member TypeName: System.String . z. . PS C:\Script> $names Name,Lastname,Age Иван,Петров,22 Сергей,Андреев, 34 Оль га,Белова,2 5 Мария,Никитина, 41
However, the Get-Content cmdlet knows nothing about the internal structure of strings, so you have to parse the content to extract its constituent parts from the strings.
Therefore, to analyze structured data from it, it is better to use a special cmdlet Import-Csv, which immediately converts the data from the file into an array of objects with the required fields.
Using Import-Csv, we read the data from the test.csv file into the $names variable and display its contents:
PS C:\Script> $names = Import-Csv -Path .\test.csv PS C:\Script> $names Name Lastname Age Иван Петров 22 Сергей Андреев 34 Ольга Белова 25 Мария Никитина 41
As you can see, the Import-Csv cmdlet understood that the first line of the file contains a header that defines the structure of each line. As with Get-Content, the $names variable is an array of objects:
PS C:\Script> Get-Member -Inputobject $names TypeName: System.Object[] Name MemberType Definition Count AliasProperty Count = Length Add Address Method Method int IList.Add(System... System.Objects, msco...
However, now each element of the $names array is not a simple string, but an object of type PSCustomObject that has Name, LastName, and Ade fields:
PS C:\Script> $nanes | Get-Member TypeName: System.Management.Automation.PSCustomObject Name MemberType Definition Equals Method bool Equals(System.Object obj) GetHashCode Method int GetHashCode() GetType Method type GetType() ToString Method string ToString() Age NoteProperty string Age=22 Lastname NoteProperty string Lastname=neTpoB Name NoteProperty string Name=HBaH
Using this array, we can easily search, sort, or group data using the cmdlets discussed in Chapter 5. For example, let’s sort the records in our table by age:
PS C:\Script> $names | Sort-Object Age Name Lastname Age Иван Петров 22 Ольга Белова 25 Сергей Андреев 34 Мария Никитина 41
Field names during import can be changed using -Header:
PS C:\Script> Import-Csv -Path .\test.csv -Header 'Имя', 'Фамилия', 'Возраст' Имя Фамилия Возраст Name Lastname Age Иван Петров 22 Сергей Андреев 34 Ольга Белова 25 Мария Никитина 41
By default, import-csv considers a comma as the delimiter. If a different character is used as a delimiter in the file, it must be specified in the -Delimiter parameter. For example, let’s replace commas in the test.csv file with a semicolon symbol:
PS C:\Script> (Get-Content .\test.csv -Raw).Replace(',', ';') | Set-Content .\test.csv PS C:\Script> Get-Content .\test.csv Name;Lastname;Age Иван;Петров;22 Сергей;Андреев;3 4 Оль га;Белова;2 5 Мария;Никитина;41 Імпортувати такий файл можна наступною командою: PS C:\Script> Import-Csv -Path .\test.csv -Delimiter Name Lastname Age Иван Петров 22 Сергей Андреев 34 Ольга Белова 25 Мария Никитина 41
You can use the Export-Csv cmdlet to generate a CSV file from an array of PowerShell objects. For example, let’s get a list of all services running on the local computer (status, name and display name):
Let’s save this list in the $services variable:
PS C:\Script> $services = Get-Service | Where-Object Status -eq Running | Select-Object Status, Name, DisplayName Експортуємо масив об'єктів $services до CSV-файлу services.csv: PS C:\Script> $services | Export-Csv -Path .\services.csv
Let’s check the contents of the services.csv file:
PS C:\Script> Get-Content .\services.csv #TYPE Selected.System.ServiceProcess.Servicecontroller "Status","Name","DisplayName" "Running","AdobeARMservice","Adobe Acrobat Update Service" "Running","AMD External Events Utility","AMD External Events Utility" "Running","AMD FUEL Service","AMD FUEL Service"
As you can see, the file was created correctly, but its first line contains the type of objects from which it was obtained (in our case, these are objects of type Se1ected). System. ServiceProcess . ServiceContr011er). If you do not need this information in the file, you should specify the -NoTypeInformation option when calling Export-Csv:
PS C:\Script> $services | Export-Csv -Path .\services.csv -NoTypeinformation
In this case, the first line in the file will contain the field names:
PS C:\Script> Get-Content .\services.csv "Status","Name","DisplayName" "Running","AdobeARMservice","Adobe Acrobat Update Service" "Running","AMD External Events Utility","AMD External Events Utility" "Running","AMD FUEL Service","AMD FUEL Service" "Running","Appinfo","Application Information"
PowerShell has the ConvertFrom-Csv and ConvertTo-Csv cmdlets, which work similarly to ImportFrom-Csv and ExportTo-Csv, but do not access external files.
For example, let’s repeat our example with a list of running services:
PS C:\Script> $services = Get-Service | Where-Object Status -eq Running | Select-Object Status, Name, DisplayName
We convert the array of $services objects to CSV:
PS C:\Script> $csv = $services | ConvertTo-Csv
The $csv variable contains a string in CSV format:
PS C:\Script> $csv | Get-Member TypeName: System. String PS C:\Script> $csv #TYPE Selected.System.ServiceProcess.Servicecontroller "Status", "Name","DisplayName" "Running","AdobeARMservice","Adobe Acrobat Update Service" "Running","AMD External Events Utility","AMD External Events Utility" "Running","AMD FUEL Service","AMD FUEL Service"
Now convert the $csv string into the $objs variable:
PS C:\Script> $objs = $csv | ConvertFrom-Csv
The Sobjs array will contain objects of type Selected. System. Serviceprocess. The Servicecontroller that was specified in the #type comment:
PS C:\Script> Get-Mamber -Inputobject $objs TypeName: System.Object[] PS C:\Script> $objs | Get-Member TypeName: CSV:Selected.System.ServiceProcess.Servicecontroller PS C:\Script> $objs Status Name DisplayName Running AdobeARMservice Adobe Acrobat Update Service Running AMD External Events Utility AMD External EventsUtility Running AMD FUEL Service AMD FUEL Service
Data in JSON (JavaScript Object Notation) format is widely used in web development, as well as for storing application settings. You CAN work with JSON in PowerShell USING the ConvertFrom-Json and ConvertTo-Json cmdlets. Note Unlike CSV, there are no special cmdlets for importing/exporting JSON files in PowerShell.
Let’s convert to JSON the contents of our CSV file C:\Script\test.csv. To do this, we first import this file into an array of PowerShell objects, and then convert this array to JSON by passing it through the pipeline to the convertToJson cmdlet:
PS C:\Script> Import-Csv -Path .\test.csv Name Lastname Age Иван Петров 22 Сергей Андреев 34 Ольга Белова 25 Мария Никитина 41 PS C:\Script> Import-Csv -Path .\test.csv | ConvertTo-Json [ { "Name": "Иван", "Lastname": "Петров", "Age": "22" }, "Name": "Сергей", "Lastname": "Андреев", "Age": "34" { "Name": "Ольга", "Lastname": "Белова", "Age": "25" }, ( "Name": "Мария", "Lastname": "Никитина", "Age": "41" } ]
We make sure that the result is a line:
PS C:\Script> $json = Import-Csv -Path .\test.csv | ConvertTo-Json PS C:\Script> Get-Member -Inputobject $json TypeName: System.String
By default, JSON-cTpoka is generated in a structured form that is convenient for human understanding: nested elements are indented, fields are separated by line breaks. If the JSON is processed automatically, you can use the -Compress option to remove unnecessary spaces and line breaks for formatting. In this case, fields and values are written sequentially, and JSON-cTpoka will have a minimum length:
PS C:\Script> Import-Csv -Path .\test.csv | ConvertTo-Json -Compress [{"Name":"Иван","Lastname":"Петров","Age":"22"},{"Name":"Сергей","Lastname": "Андреев","Age":"34"},{"Name":"Ольга","Lastname":"Белова","Age":"25"},{"Name":" Мария","Lastname":"Никитина","Age":"41"}]
To create a file in JSON format, the string obtained as a result of conversion must be saved using the set-content or Out-File cmdlets:
PS C:\Script> Inport-Csv -Path .\test.csv | ConvertTo-Json | Set-Content .\test.json
Note that unlike exporting to a CSV file, when converting an array of objects to JSON, information about the initial type of objects is lost. To illustrate this, let’s create the $services array again, which will store the list of services running on the local computer:
PS C:\Script> $services = Get-Service | Where-Object Status -eq Running | Select-Object Status, Name, DisplayName Елементами цього масиву будуть об'єкти типу Selected.System.Serviceprocess. Servicecontroller: PS C:\Script> $services | Get-Member TypeName: Selected.System.ServiceProcess.Servicecontroller Перетворимо масив $services в JSON и сохраним результат в переменной $json_ services: PS C:\Script> $json_services = $services | ConvertTo-Json
Now let’s perform the reverse conversion – convert the $json_services string into the $objs array USING the ConvertFrom-Json cmdlet:
PS C:\Script> $objs = $json_services | ConvertFrom-Json
Let’s check what type the elements of the $objs array have:
PS С:\Users\andrv> $objs | Get-Member TypeName: System.Management.Automation.PSCustomObject Name MemberType Definition Equals Method bool Equals(System.Object obj) GetHashCode Method int GetHashCode() GetType Method type GetType() ToString Method string ToString() DisplayName NoteProperty string DisplayName=Adobe Acrob Name NoteProperty string Name=AdobeARMservice Status NoteProperty int Status=4
As you can see, objects of the type System.Management.Automation.PSCustomObject are created as a result of conversion from the JSON string $json_services. These objects have the same fields as the original Selected objects. System. Serviceprocess.servicecontroller, however, specific methods for controlling services in JSON-converted objects will not be available.
ConvertFrom-*, ConvertTo-* cmdlets are used to work with the most common PowerShell structured data formats.
CSV files can be read and modified using the import-csv and export-csv cmdlets.