Original URL: https://trevorsmale.github.io/techblog/post/pacu13/

System Hardening

Linux system hardening means reducing a system's attack surface through measures such as disabling unnecessary services, enforcing access controls, and applying security patches. Tools like OpenSCAP and its oscap scanner, together with baselines such as the STIGs, help automate security audits, enforce compliance standards, and identify vulnerabilities.

Discussion Post 1:

Your security team comes to you with a discrepancy between the production security baseline and something that is running on one of your servers in production. There are 5 servers in a web cluster and only one of them is showing this behavior. They want you to account for why something is different.

How are you going to validate the difference between the systems?

I am going to assume that I am new to the system in general and have only surface-level knowledge picked up from fellow staff. I am also assuming we are working with a Red Hat-based system.

Starting off simple

Maybe the problem is an obvious one, so I would just start off with a glance.

Quick cursory check (kernel version, uname -a)
Manually checking logs (journalctl, dmesg, audit.log, syslog)
Checking ports (socket statistics, ss -ntulp)
Listing installed packages (dnf list installed, rpm -qa)
Listing users and logins (/etc/passwd, w, last)
Seeing what systemd services are running (systemctl list-units --type=service)
Digging for documentation and/or commit history
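A minimal sketch of how these quick checks could be captured to files, so that each server's output can be compared later. The function names collect_snapshot and capture and the file names are hypothetical, and most of the commands need root to show everything (ss -p, for example, only shows process names you are allowed to see).

```shell
# Hypothetical helper: capture the quick checks into one directory per host.
# A command that is missing or fails is noted in its file instead of
# aborting the whole run.
collect_snapshot() {
    outdir="$1"
    mkdir -p "$outdir" || return 1

    # capture NAME COMMAND...: store a command's output in $outdir/NAME.txt
    capture() {
        name="$1"; shift
        if command -v "$1" >/dev/null 2>&1; then
            "$@" >"$outdir/$name.txt" 2>&1 \
                || echo "exit status $?" >>"$outdir/$name.txt"
        else
            echo "command not found: $1" >"$outdir/$name.txt"
        fi
    }

    capture kernel   uname -a
    capture sockets  ss -ntulp
    capture packages rpm -qa
    capture services systemctl list-units --type=service --no-pager
    capture logins   last -n 20
    capture users    cut -d: -f1,3,7 /etc/passwd
}
```

Running something like collect_snapshot "/tmp/snapshot-$(hostname)" on each of the five servers would leave one directory per host with the same file names in each.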

Deeper ⛏️

If there were no low-hanging fruit, I would then check configurations:

GRUB configuration, firewalld, AppArmor, SELinux
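One way to check configuration files quickly is to compare checksums rather than read each file. This is only a sketch: config_fingerprint is a hypothetical helper, and the paths in the usage note are the usual locations on a Red Hat-based system, not something confirmed for this cluster.

```shell
# Hypothetical helper: print "checksum  path" for each existing file given,
# so the output from two servers can be compared line by line.
config_fingerprint() {
    for f in "$@"; do
        # Skip paths that do not exist on this host.
        if [ -f "$f" ]; then
            sha256sum "$f"
        fi
    done
    return 0
}
```

For example, config_fingerprint /etc/selinux/config /etc/default/grub /etc/firewalld/firewalld.conf could be run on a healthy server and on the odd one out. Any line whose checksum differs points at a file worth reading in full.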

Sorting 🪰‘ish from 🌶️

If I do see something distinctly different, I would employ a more sophisticated approach with difference checking. Given that everything is structured text, I can save the output from a working system and from the goose 🪿 to separate files and run diff against them.

Diff’ing the Logs
Diff’ing Socket Statistics
Diff’ing Installed Packages
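The diff step can be sketched as below. The helper name diff_capture is hypothetical, and it assumes both inputs are plain text files with one item per line (a package list, for instance). Sorting first keeps ordering differences from showing up as noise.

```shell
# Hypothetical helper: show lines that differ between a capture from a
# known-good server and the same capture from the suspect server.
#   "< line"  appears only in the good capture
#   "> line"  appears only in the suspect capture
diff_capture() {
    good="$1"; suspect="$2"
    tmpdir=$(mktemp -d) || return 1
    sort "$good"    >"$tmpdir/good"
    sort "$suspect" >"$tmpdir/suspect"
    # Plain diff also prints position markers such as "1a2"; keep only the
    # lines that carry content.
    diff "$tmpdir/good" "$tmpdir/suspect" | grep '^[<>]'
    rm -rf "$tmpdir"
}
```

This works well for package lists and socket statistics. Logs need filtering first, since timestamps make nearly every line differ between hosts.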

What are you going to look at to explain this?

I think I have answered this above.

What could be done to prevent this problem in the future?

Introducing or improving the change management policy and employing version control would be my first suggestion.
Ensuring that there is a build/test/deploy pipeline that integrates tightly with change management.
Using IaC and automation to ensure consistency and repeatability with tools like Ansible, Packer, Podman or Kubernetes.
Hardening systems with either simple policies or through the guidance of STIGs.
Introducing stronger controls over user privileges, such as role-based access control (RBAC) policies.

Discussion Post 2:

Your team has been giving you more and more engineering responsibilities. You are being asked to build out the next set of servers to integrate into the development environment. Your team is going from RHEL 8 to Rocky 9.4.

How might you start to plan out your migration?

Observe

Firstly I would gather system information

Benchmark/baseline performance metrics and utilization (Disk, I/O, PS, Connections etc)
Configs (Scripts and configuration files)
Installed Packages.
Users (Listing users and privileges)
Policies (Firewall, SELinux)
Purpose (Assessing the use of a particular system to see if it may need changes/upgrades)
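As one example of the information gathering above, here is a sketch for listing the accounts that can actually log in. The helper name login_users is hypothetical, and it assumes that a shell ending in nologin or false marks a non-interactive account. Special accounts such as sync or shutdown will still appear and need a manual look.

```shell
# Hypothetical helper: list accounts whose shell allows a login, printed as
# "name uid shell". Reads /etc/passwd unless another file is given.
login_users() {
    passwd_file="${1:-/etc/passwd}"
    # Field 7 of a passwd entry is the login shell.
    awk -F: '$7 !~ /(nologin|false)$/ { print $1, $3, $7 }' "$passwd_file"
}
```

Saving this output from the RHEL 8 host gives a short list of users to recreate, or deliberately drop, on the Rocky 9.4 build.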

Capture

I would snapshot the current system if possible
If a complete snapshot copy is not possible, I would gather files essential to rebuilding a replica

Reconstruct

Build it in a test VM emulating the current environment
Template the VM for experimental changes (Adding additional tools or Configs)

Analyze / Optimize

Gather business or operational requirements, perhaps the system needs enhancements
Experiment with performance tuning
Test new packages and/or configurations

Build

During the analysis and optimization phase, I would start a playbook with information gathered from previous phases. I would build and run the playbook against VM templates until satisfied.

Deploy

Given the prior phases, my playbook should be robust and capable of handling the transition. However, I would still make sure there is a solid backup and rollback plan in case something fails.

What are you going to check on the existing systems to baseline your build?

Compute Usage
Memory Load
Disk Resources
Networking Metrics
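A rough sketch of recording some of these numbers on the existing hosts. The function names are hypothetical, and it assumes a Linux kernel recent enough to expose MemAvailable in /proc/meminfo. Network counters are left out here and would need their own capture, for example from ss -s.

```shell
# Hypothetical helper: percentage of memory in use, computed from the
# MemTotal and MemAvailable lines. Reads /proc/meminfo unless a file is given.
mem_used_pct() {
    meminfo="${1:-/proc/meminfo}"
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%.1f\n", (total - avail) * 100 / total }' "$meminfo"
}

# Hypothetical helper: one-line baseline record with the 1-minute load
# average, memory use, and how full the root filesystem is.
baseline_line() {
    load=$(cut -d' ' -f1 /proc/loadavg)
    mem=$(mem_used_pct)
    disk=$(df -P / | awk 'NR == 2 { print $5 }')
    echo "load1=$load mem_used_pct=$mem root_disk_used=$disk"
}
```

A single reading says little, so baseline_line would be more useful appended to a file at intervals (from cron, for instance) over a representative period.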

What kind of validation plan might you use for your new Rocky 9.4 systems?

I would have a separate playbook built that would validate performance against what I was observing during my VM experimentation. Though the environment may differ from that of the VM, I would still be able to discern performance characteristics and notice any outlier differences.
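The outlier check itself comes down to comparing a measured value with its baseline. Below is a sketch of that comparison in shell rather than in a playbook. The helper name within_tolerance and the idea of a percentage tolerance are assumptions made for illustration.

```shell
# Hypothetical helper: within_tolerance BASELINE MEASURED PERCENT
# Succeeds (exit 0) when MEASURED is within PERCENT percent of BASELINE,
# and fails (exit 1) otherwise. Assumes a positive baseline.
within_tolerance() {
    awk -v b="$1" -v m="$2" -v p="$3" 'BEGIN {
        d = m - b
        if (d < 0) d = -d          # absolute difference
        exit !(d <= b * p / 100)   # exit 0 when inside the allowed band
    }'
}
```

For example, within_tolerance 100 108 10 succeeds, while within_tolerance 100 120 10 fails and flags the value as an outlier. A validation playbook could run the same kind of check as a task and fail the play on a non-zero exit.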

Digging Deeper

Run through this lab: https://killercoda.com/het-tanis/course/Linux-Labs/107-server-startup-process 👍

How does this help you better understand the discussion 13-2 question?

Well, when I am gathering a picture of my current security baseline, I can use some of these tools, like dmesg and ss, to see what attack surface I may have.

Run through this lab: https://killercoda.com/het-tanis/course/Linux-Labs/203-updating-golden-image 👍

How does this help you better understand the process of hardening systems?

Reflection Questions

What questions do you still have about this week?
How can you apply this now in your current role in IT? If you’re not in IT, how can you look to put something like this into your resume or portfolio?

ProLUG Admin Course Unit 13 🐧