Eroding the Skills That Drive IT
Over the last few years, as IT staffing has been trimmed to minimal levels and the adoption of cloud-based services has risen dramatically, the erosion of the basic skills, tools, and awareness needed to run a secure environment has steadily accelerated. The lack of “IT Fundamentals” becomes eerily apparent as soon as you open a web browser: successful attacks by folks with less than good intentions continue to grab the headlines. Simple things such as basic troubleshooting skills and asset management have been ignored, abandoned, or left in such a state that everyone in the environment questions their usefulness. Doubt creates mistrust, and mistrust creates unjustified blame.
I cannot count the number of conversations that begin with “Am I protected against this latest threat by your endpoint solution?” and end abruptly when I simply respond, “Does every endpoint in your environment that is vulnerable to this specific threat have our solution installed and configured properly, and is each one patched with all currently available patches?” The frequent answer is, “Well, I can check the WSUS logs, then look against my SCCM reports, and then my endpoint solution reports, but I am not sure about the servers; I don’t have access to those.” So I respond with, “Well, then the answer is no,” at which point the customer becomes upset, typically confrontational, and sometimes storms off with a few choice words. Core IT fundamentals: if you can’t tell me what’s there, how can I protect it?
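The question above comes down to comparing one authoritative inventory against what each tool reports. Here is a minimal sketch of that comparison, assuming you can export plain lists of host names from your inventory, your endpoint console, and your patch reporting. The function names and the `CoverageReport` type are hypothetical, and no real WSUS, SCCM, or endpoint product API is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoverageReport:
    unprotected: frozenset  # in inventory, but no endpoint agent reported
    unpatched: frozenset    # in inventory, but missing from the patched list
    unknown: frozenset      # reported by a tool, but absent from inventory


def coverage_gaps(inventory, agent_hosts, patched_hosts):
    """Compare three host-name lists and return the gaps between them.

    Host names are lower-cased so that "WS-01" and "ws-01" match.
    All three inputs are assumed to be exports you already have.
    """
    inv = frozenset(h.lower() for h in inventory)
    agents = frozenset(h.lower() for h in agent_hosts)
    patched = frozenset(h.lower() for h in patched_hosts)
    return CoverageReport(
        unprotected=inv - agents,
        unpatched=inv - patched,
        unknown=(agents | patched) - inv,
    )


def is_fully_protected(report):
    """The answer is 'yes' only when every gap is empty."""
    return not (report.unprotected or report.unpatched or report.unknown)
```

Calling `coverage_gaps(inventory, agent_hosts, patched_hosts)` and then `is_fully_protected(...)` on the result gives the honest yes or no. Note that the `unknown` set matters as much as the other two: a machine that reports to a tool but is missing from the inventory means the inventory itself cannot be trusted.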
My other favorite scenario that demonstrates this I call the ‘Google’ syndrome. It is not a knock on the vendor but on the notion that every ‘software problem’ on a system can be resolved by applying the three or four fixes found by typing the symptoms into a search window. This typically leaves both the application and the system in a state where any real resolution may not truly address the problem, and it may open the system(s) to other forms of compromise. We seem to be missing another core IT fundamental: do not do anything until you have worked through the following troubleshooting steps:
- Reboot (sounds simple but is rarely done, especially in the server world). If the problem still exists, call the vendor immediately, but be prepared to be asked the questions below and to follow through on them:
- Does any other system exhibit this behavior?
- What were the last changes made to the system? (Inventory, asset management, change control: all IT basics.) This one is ignored 99.9% of the time. It is always the app’s fault, even though it was fine until that last patch went out or a configuration change was made in another tool.
- Have you backed out the last changes to the system? Did it resolve the problem?
- What do the systems have in common?
- Does an uninstall/re-install of the suspected applications resolve the problem?
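Two of the questions above, "what were the last changes?" and "what do the systems have in common?", can be answered together if change control is actually recorded. The following is a rough sketch, assuming a change log is available as simple records of host, timestamp, and change name. The `Change` record and both functions are hypothetical and are not tied to any particular change-management product.

```python
from datetime import datetime
from typing import NamedTuple


class Change(NamedTuple):
    host: str       # system the change was applied to
    when: datetime  # when it was applied
    change: str     # patch name, config change, install, etc.


def recent_changes(change_log, host, since):
    """Names of changes applied to one host at or after `since`."""
    return {c.change for c in change_log if c.host == host and c.when >= since}


def common_recent_changes(change_log, affected_hosts, since):
    """Changes that every affected host received at or after `since`.

    These are the first candidates to back out, before blaming the app.
    """
    per_host = [recent_changes(change_log, h, since) for h in affected_hosts]
    if not per_host:
        return set()
    return set.intersection(*per_host)
```

If the result is empty, the affected systems share no recent change and the search has to widen. If the change log itself is empty or out of date, that is the real finding.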
The problem I see here is that the ‘quick’ fix someone posted online may resolve the problem, but it is typically used because we lack the fundamental skills, time, and resources to make sure, with the vendor’s help or otherwise, that the problem never occurs again. Today’s typical IT worker simply has too many tasks to handle in any given day, is stuck in countless redundant or unnecessary meetings, and has not been properly trained to use a product to its full capacity. This is compounded by the lack of time to validate that tools which appear to be functioning properly get the care, maintenance, and monitoring they need to be effective in the environment. Yes, this is more than making sure you can log in to the console and pull the latest status report. Tools left ignored and unmonitored typically do not function effectively, and they leave serious gaps in your ability to monitor the environment, which once again gives bad actors huge holes in which to hide or obscure their activities.
Don’t have the staff, budget, or training dollars to implement the next upgrade? Moving infrastructure to cloud-based services typically compounds both of the problems above. It creates a clear void of talent, visibility, and understanding just when you need to troubleshoot technical issues and secure data that now extends well beyond the environment you can access. How can you validate the controls used by the cloud services provider if you don’t have the talent in house to do so? Most hire a third party, but this becomes a ‘one time’ action, and to properly secure an environment you need to validate configurations and infrastructure components on a regular, if not real-time, basis. It’s bad enough that a cloud services provider typically will not, or cannot, give you the patch level of every system your data resides on; in fact, they claim this is part of securing your data: “We spread the load across a number of internally/externally unidentifiable boxes so that no individual on our side can look at or exploit it.” So if you don’t know where the box is, how can you validate that the latest patches are applied on every server where your data sits?