Nico Schottelius, 02/07/2024 12:00 PM
The ungleich hardware maintenance guide¶
This guide describes common operations on hardware we use.
Using the ungleich-hardware container in kubernetes and docker¶
To manage hardware on server1 in kubernetes, you can use:
apiVersion: v1
kind: Pod
metadata:
  name: ungleich-hardware
spec:
  containers:
  - name: ungleich-hardware
    image: harbor.ungleich.svc.p10.k8s.ooo/ungleich-public/ungleich-hardware:0.0.3
    args:
    - sleep
    - "1000000"
    volumeMounts:
    - mountPath: /dev
      name: dev
    securityContext:
      privileged: true
  nodeSelector:
    kubernetes.io/hostname: "server1"
  volumes:
  - name: dev
    hostPath:
      path: /dev
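A minimal sketch of how the pod above might be started and entered. The manifest file name `ungleich-hardware-pod.yaml` and the function name are made up for this example, and it assumes the image ships `/bin/sh`, which this guide does not confirm.

```shell
# Start the maintenance pod and open a shell in it.
# Assumptions: the manifest file name is hypothetical; /bin/sh exists in the image.
hw_pod_shell() {
    manifest="${1:-ungleich-hardware-pod.yaml}"
    kubectl apply -f "$manifest" || return 1
    # Wait until the pod is ready before trying to exec into it
    kubectl wait --for=condition=Ready pod/ungleich-hardware --timeout=120s || return 1
    kubectl exec -ti ungleich-hardware -- /bin/sh
}
```

When the work is done, the pod can be removed again with `kubectl delete pod ungleich-hardware`.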
To use it with docker:
docker run -v /dev:/dev --privileged -ti harbor.ungleich.svc.p10.k8s.ooo/ungleich-public/ungleich-hardware:0.0.3
APU Bios Update¶
- Download the correct BIOS from https://pcengines.github.io/
- Check whether the board is an apu1, apu2, apu3, or apu4 before downloading
- Install flashrom
- Flash the BIOS using flashrom
flashrom -w THEROMFILE -p internal
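The flash step can be wrapped so that the current flash contents are saved first. This is a sketch under assumptions: the function name and the default backup file name are invented here, and only the `-r` (read) and `-w` (write) modes of flashrom with the `internal` programmer are used.

```shell
# Back up the current BIOS, then write the new image.
# Assumption: the function name and backup file name are made up for this example.
apu_flash_bios() {
    rom="$1"
    backup="${2:-bios-backup.rom}"
    # Refuse to continue if the ROM file is missing or empty
    [ -s "$rom" ] || { echo "ROM file missing or empty: $rom" >&2; return 1; }
    # Read the current flash contents first so there is a way back
    flashrom -p internal -r "$backup" || return 1
    # Write the new image
    flashrom -p internal -w "$rom"
}
```

Called as `apu_flash_bios THEROMFILE`, it stops before writing if the backup read fails.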
APU Serial and bootloader configuration¶
- Ensure that the bootloader has "console=ttyS0,115200" configured
- Ensure that there is a getty running on serial
- Use grub-bios as the bootloader
- Install using
grub-install /dev/sda
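Whether the running kernel actually received the serial console parameter can be checked from its command line. The helper below is a sketch with a made-up name; it looks for exactly `console=ttyS0,115200`, so a variant such as `console=ttyS0,115200n8` would need a looser pattern.

```shell
# Return 0 if the given kernel command line file contains the serial console.
# Defaults to /proc/cmdline, i.e. the parameters of the running kernel.
has_serial_console() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each parameter on its own line and match it as a whole
    tr ' ' '\n' < "$cmdline_file" | grep -qx 'console=ttyS0,115200'
}
```

For example, `has_serial_console || echo "serial console not configured"` can be run after a reboot.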
Updating the Perc H800 SAS controller¶
- wget 'https://dl.dell.com/FOLDER03292738M/3/SAS-RAID_Firmware_XKF5X_LN_12.10.7-0001_A13.BIN?uid=4b8a2506-f4d4-46a9-ab19-3c2a5008a782&fn=SAS-RAID_Firmware_XKF5X_LN_12.10.7-0001_A13.BIN' -O SAS-RAID_Firmware_XKF5X_LN_12.10.7-0001_A13.BIN
- chmod u+x SAS-RAID_Firmware_XKF5X_LN_12.10.7-0001_A13.BIN
- ./SAS-RAID_Firmware_XKF5X_LN_12.10.7-0001_A13.BIN
HP servers disk management¶
Required kernel modules:
sg
cciss
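A small sketch for loading these modules only when they are not already loaded. The function name and the `MODULES_FILE` override are assumptions made for this example; the override exists only so the check can be tried against a file other than `/proc/modules`.

```shell
# Load each named kernel module unless it is already listed as loaded.
# MODULES_FILE is a hypothetical override, defaulting to /proc/modules.
ensure_modules() {
    modules_file="${MODULES_FILE:-/proc/modules}"
    for mod in "$@"; do
        # Each line of /proc/modules starts with the module name
        if grep -q "^${mod} " "$modules_file"; then
            echo "$mod already loaded"
        else
            modprobe "$mod" || return 1
        fi
    done
}
```

Usage would be `ensure_modules sg cciss`.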
Show all drives/controller overview:
hpacucli ctrl all show config
hpacucli ctrl slot=0 pd all show
Add a disk as raid0:
hpacucli ctrl slot=0 create type=ld drives=1I:1:1 raid=0
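When several disks have to be added this way, the drive IDs can be taken from the `ctrl all show config` output. The sketch below assumes the output layout quoted further down in this guide (an `unassigned` line followed by `physicaldrive` lines); the function names are made up, and the second function only prints the commands instead of running them.

```shell
# Print the IDs of all physical drives listed under "unassigned"
# in `hpacucli ctrl all show config` output read from stdin.
unassigned_drives() {
    awk '
        $1 == "unassigned" { in_unassigned = 1; next }
        $1 == "array"      { in_unassigned = 0 }
        in_unassigned && $1 == "physicaldrive" { print $2 }
    '
}

# Print (not run) one RAID 0 create command per unassigned drive.
# The controller slot defaults to 0, as in the examples of this guide.
raid0_commands() {
    slot="${1:-0}"
    unassigned_drives | while read -r drive; do
        echo "hpacucli ctrl slot=$slot create type=ld drives=$drive raid=0"
    done
}
```

For example, `hpacucli ctrl all show config | raid0_commands` lists the commands for review before any of them is executed.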
Deleting a logical drive:
hpacucli ctrl slot=0 ld 2 delete
Copied from https://www.thegeekstuff.com/2014/07/hpacucli-examples/ (mainly to keep a cached copy):
1. Two ways to execute the command

When you type the command hpacucli, it will display a “=>” prompt as shown below where you can enter all the hpacucli commands explained in the article.

    # hpacucli
    HP Array Configuration Utility CLI 9.20.9.0
    Detecting Controllers...Done.
    Type "help" for a list of supported commands.
    Type "exit" to close the console.

    => rescan

Or, if you don’t want to get to the hpacucli prompt, you can just enter the following directly in the Linux prompt. The following is exactly the same as the above.

    # hpacucli rescan

2. Display Controller and Disk Status

To display the detailed status of the controller and the disk status, execute the following command.

    # hpacucli
    => ctrl all show config

    Smart Array P410i in Slot 0 (Embedded) (sn: 50014380101D61C0)

       array A (SAS, Unused Space: 0 MB)

          logicaldrive 1 (136.7 GB, RAID 1, OK)

          physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 146 GB, OK)
          physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 146 GB, OK)

       unassigned

          physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS, 300 GB, OK)
          physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS, 300 GB, OK)
          physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SAS, 300 GB, OK)
          physicaldrive 2I:1:7 (port 2I:box 1:bay 7, SAS, 300 GB, OK)
          physicaldrive 2I:1:8 (port 2I:box 1:bay 8, SAS, 300 GB, OK)

       SEP (Vendor ID PMCSIERA, Model SRC 8x6G) 250 (WWID: 50014380101D61CF)

In this example, as shown in the above output, we have total 7 physical drives. The first RAID group RAID 1 contains 2 physical drives and the remaining physical drives are not assigned to any of the logical drives.

3. View Controller Status

To display the status of just the controller, do the following. In this example, the controller is working perfectly without any issues.

    => ctrl all show status

    Smart Array P410i in Slot 0 (Embedded)
       Controller Status: OK
       Cache Status: OK

4. View Drive Status

To display the status of the physical drive, do the following. In this example, we have two 146GB physical drives, and 5 300GB physical drives, and all are in perfect condition.

    => ctrl slot=0 pd all show status

       physicaldrive 1I:1:1 (port 1I:box 1:bay 1, 146 GB): OK
       physicaldrive 1I:1:2 (port 1I:box 1:bay 2, 146 GB): OK
       physicaldrive 1I:1:3 (port 1I:box 1:bay 3, 300 GB): OK
       physicaldrive 1I:1:4 (port 1I:box 1:bay 4, 300 GB): OK
       physicaldrive 2I:1:6 (port 2I:box 1:bay 6, 300 GB): OK
       physicaldrive 2I:1:7 (port 2I:box 1:bay 7, 300 GB): OK
       physicaldrive 2I:1:8 (port 2I:box 1:bay 8, 300 GB): OK

5. View Individual Drive Status

To display the detail status of a specific physical drive, do the following. In this example, we like to know the status of “pd” (physical disk) in slot 0. The specific disk is “2I:1:6”, which we figured out from the output of the previous command.

As shown in the output below, this displays the Serial Number, Make, Model, Size and Firmware version of this specific disk. This can be very helpful during troubleshooting.

    => ctrl slot=0 pd 2I:1:6 show detail

    Smart Array P410i in Slot 0 (Embedded)

       unassigned

          physicaldrive 2I:1:6
             Port: 2I
             Box: 1
             Bay: 6
             Status: OK
             Drive Type: Unassigned Drive
             Interface Type: SAS
             Size: 300 GB
             Rotational Speed: 10000
             Firmware Revision: HPD4
             Serial Number: EB01PC416C4C1214
             Model: HP EG0300FBDSP
             Current Temperature (C): 38
             Maximum Temperature (C): 56
             PHY Count: 2
             PHY Transfer Rate: 6.0Gbps, Unknown

6. View All Logical Drives

The following command will display all available logical drives on the system. As shown in the output below, we currently have only one logical drive in RAID 1 with total size of around 136GB.

    => ctrl slot=0 ld all show

    Smart Array P410i in Slot 0 (Embedded)

       array A

          logicaldrive 1 (136.7 GB, RAID 1, OK)

7. Create New RAID 0 Logical Drive

Execute the following command to create a new logical drive using RAID 0 option.

    => ctrl slot=0 create type=ld drives=1I:1:3 raid=0

The above command creates a logical drive with the physical drives 1I:1:3 on RAID 0 configuration in slot 0.

8. Create New RAID 1 Logical Drive

Execute the following command to create a new logical drive using RAID 1 option.

    => ctrl slot=0 create type=ld drives=1I:1:3,1I:1:4 raid=1

The above command creates a logical drive with the two physical drives 1I:1:3 and 1I:1:4 on RAID 1 configuration in slot 0.

9. Create New RAID 5 Logical Drive

Execute the following command to create a new logical drive using RAID 5 option.

    => ctrl slot=0 create type=ld drives=1I:1:3,1I:1:4,2I:1:6,2I:1:7,2I:1:8 raid=5

The above command creates a logical drive with the five physical drives on RAID 5 configuration in slot 0.

Once these logical drives are created, you should see the disks from the fdisk and you can format it from there and start using it.

After you create a logical drive, execute the following command to verify that the LD got created. In this example, it shows that the RAID 5 logical drive got created successfully.

    => ctrl slot=0 ld all show status

       logicaldrive 1 (136.7 GB, RAID 1): OK
       logicaldrive 2 (1.1 TB, RAID 5): OK

10. Rescan for New Devices

If you’ve added new physical hard disks, they won’t automatically show up immediately. You have to scan for new devices as shown below.

    => rescan

11. View Detailed Logical Drive Status

To display the detailed status of the logical drive, do the following:

    => ctrl slot=0 ld 2 show

    Smart Array P410i in Slot 0 (Embedded)

       array B

          Logical Drive: 2
             Size: 1.1 TB
             Fault Tolerance: RAID 5
             Heads: 255
             Sectors Per Track: 32
             Cylinders: 65535
             Strip Size: 256 KB
             Full Stripe Size: 1024 KB
             Status: OK
             Caching: Enabled
             Parity Initialization Status: In Progress
             Unique Identifier: 600508B1001031303144363143301000
             Disk Name: /dev/cciss/c0d1
             Mount Points: None
             Logical Drive Label: A4967E2950014380101D61C008BE
             Drive Type: Data

The above shows the RAID type, the disk name assigned to the logical drive, and other information about the logical drive number 2.

12. Delete Logical Drive

To delete a logical drive with the number 2 use the below command.

    => ctrl slot=0 ld 2 delete

    Warning: Deleting an array can cause other array letters to become renamed.
             E.g. Deleting array A from arrays A,B,C will result in two remaining
             arrays A,B ... not B,C

    Warning: Deleting the specified device(s) will result in data being lost.
             Continue? (y/n) y

13. Add New Physical Drive to Logical Volume

To add the new drives to existing logical volume, do the following.

    => ctrl slot=0 ld 2 add drives=2I:1:6,2I:1:7

In this example, we are adding two additional drives specified above to the logical volume number 2.

14. Add Spare Disks

To add the spare disks to arrays that can be used in case of disk failures on one of the logical drives, do the following:

    => ctrl slot=0 array all add spares=2I:1:6,2I:1:7

In this example, we are adding two spare disks to the array.

15. Enable or Disable Cache

The below commands enable or disable cache for the entire slot.

    => ctrl slot=0 modify dwc=disable

    => ctrl slot=0 modify dwc=enable

16. Erase Physical Drive

Execute the following command to erase a physical drive in array B on slot 0.

    => ctrl slot=0 pd 2I:1:6 modify erase

17. Blink Physical Disk LED

To blink the LED on the physical drives for the logical drive 2, do the following. This will make the LEDs blink on all the physical drives that belong to logical drive 2.

    => ctrl slot=0 ld 2 modify led=on

Once you know which drive belongs to logical drive 2, turn the LED blinking off as shown below.

    => ctrl slot=0 ld 2 modify led=off
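The drive status listing from step 4 lends itself to a simple health check. The following is a sketch, not part of the copied article: the function name is made up, and it assumes every drive line ends in a status word, with anything other than `OK` treated as a problem.

```shell
# Read `hpacucli ctrl slot=0 pd all show status` output on stdin and
# print the ID and status of every physical drive that is not OK.
# Exit status: 0 if all drives are OK, 1 otherwise.
failed_drives() {
    awk '
        $1 == "physicaldrive" && $NF != "OK" { print $2, $NF; bad = 1 }
        END { exit bad + 0 }
    '
}
```

It could be used as `hpacucli ctrl slot=0 pd all show status | failed_drives`, for instance from a monitoring job that alerts on a non-zero exit status.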
Dell servers disk management (megacli)¶
Listing all disks:
megacli -PDList -aALL
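The full listing is long, so a condensed view can help when looking for the enclosure and slot of a disk. This sketch assumes the output contains `Enclosure Device ID:`, `Slot Number:` and `Firmware state:` lines for each disk; the labels may differ between megacli versions, and the function name is made up.

```shell
# Summarise `megacli -PDList -aALL` output (read from stdin) as
# "enclosure:slot state" lines. An enclosure of N/A is printed as empty.
# Assumption: the field labels below match the installed megacli version.
pd_summary() {
    awk -F': *' '
        /^Enclosure Device ID/ { enc = ($2 == "N/A") ? "" : $2 }
        /^Slot Number/         { slot = $2 }
        /^Firmware state/      { print enc ":" slot, $2 }
    '
}
```

Usage would be `megacli -PDList -aALL | pd_summary`.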
Adding disks:
megacli -CfgLdAdd -r0 [Enclosure Device ID:slot] -aX (X : host is 0. md-array is 1)

# Sample call, if enclosure and slot are KNOWN (aka not N/A)
megacli -CfgLdAdd -r0 [32:0] -a0

# Sample call, if enclosure is N/A
megacli -CfgLdAdd -r0 [:0] -a0
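The two sample calls differ only in whether the enclosure is known. A small helper, sketched here with a made-up name, can build the right form from an enclosure and a slot. It only prints the command.

```shell
# Print the megacli call that adds one disk as a RAID 0 logical drive.
# An enclosure of "N/A" (or an empty one) yields the "[:slot]" form.
megacli_raid0_cmd() {
    enclosure="$1"
    slot="$2"
    adapter="${3:-0}"
    if [ "$enclosure" = "N/A" ]; then
        enclosure=""
    fi
    echo "megacli -CfgLdAdd -r0 [${enclosure}:${slot}] -a${adapter}"
}
```

For example, `megacli_raid0_cmd N/A 0` prints the `[:0]` variant. When running the printed command by hand, it is safer to quote the bracketed argument, because an unquoted `[32:0]` is a glob pattern to the shell.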
Remove cache of disks that are not in the server anymore:
megacli -DiscardPreservedCache -Lall -aAll
Remove foreign configurations on foreign disks:
megacli -CfgForeign -Clear -aAll
In many cases, you need to do both:
megacli -DiscardPreservedCache -Lall -aAll
megacli -CfgForeign -Clear -aAll
Growing a raid6:
megacli -ldrecon -Start -r6 -Add -PhysDrv[12:4] -l0 -a0
Deleting a logical drive:
root@2157f4626763:/# megacli -CfgLdDel -L0 -a0

Adapter 0: Deleted Virtual Drive-0(target id-0)

Exit Code: 0x00
root@2157f4626763:/#
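The `Exit Code: 0x00` line in the output above can be used to check a call from a script. This is a sketch with a made-up function name; it inspects only the printed output and assumes any other exit code means failure.

```shell
# Return 0 if megacli output read from stdin reports "Exit Code: 0x00".
megacli_succeeded() {
    grep -q '^Exit Code: 0x00[[:space:]]*$'
}
```

Usage would be `megacli -CfgLdDel -L0 -a0 | megacli_succeeded || echo "delete failed"`.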
SEE ALSO¶
Updated by Nico Schottelius 11 months ago · 22 revisions