Your Experiences with AV in a Virtual Desktop Environment
Greetings all,
I'm just wondering what everyone else is doing for AV in a VDI scenario. We're rolling with 750+ virtual machines running XP, on about 42 spindles. We get rocked by every def update if they all occur at the same time. Normally this isn't an issue for us, as it occurs around 2 in the morning, but we had an issue that brought us to our knees last week, when the def update didn't occur until early morning, right about the same time users hopped on. It was a bad day.
So, we're working to trim our windows down. I'm not our VDI guy, just a general systems guy, but I'm wondering how people handle this issue. I'm thinking of manually breaking up groups to force separate policies that load defs in offset time windows throughout the night.
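As a rough illustration of the offset-window idea, here is a minimal Python sketch that splits a VM list round-robin into groups and gives each group its own start time. The VM names, the group count of 6, the 45-minute spacing, and the date are all made up for the example; none of this is a SEPM feature or API, just a way to plan the groups on paper.

```python
from datetime import datetime, timedelta

def staggered_groups(vm_names, group_count, first_start, spacing_minutes):
    """Split VMs round-robin into groups, each with its own update start time."""
    schedule = []
    for g in range(group_count):
        start = first_start + timedelta(minutes=g * spacing_minutes)
        members = vm_names[g::group_count]  # every group_count-th VM, offset by g
        schedule.append((start, members))
    return schedule

# Hypothetical inventory: 750 VMs with invented names.
vms = [f"xp-vdi-{i:03d}" for i in range(750)]

# Assumed plan: 6 groups, starting at midnight, 45 minutes apart.
plan = staggered_groups(vms, 6, datetime(2010, 1, 1, 0, 0), 45)
for start, members in plan:
    print(start.strftime("%H:%M"), len(members))
```

With these assumed numbers each group holds 125 VMs and the last group starts at 03:45, so the spacing would need tuning against however long one group actually takes on the SAN.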
Just wondering what the rest of the world does. I don't know a lot of folks with VDI environments like ours, so I look forward to hearing others' experiences in this area.
Thanks for your time,
-Ryan
Comments
We use MR4 MP2
We use MR4 MP2 (antivirus/spyware only) on all of our VMware guest servers. We are also testing antivirus/antispyware with network threat protection (only using IPS). Performance has not been an issue. There is one issue with the client taking up high CPU when no user is logged into a VMware guest, but it hasn't been a huge issue for us.
We do use VMware view just a
We do use VMware View just a bit and are heavy users of virtual server technologies, but I haven't seen any issue with A/V or definition updates in either system. My first inclination is that your systems are running very close to full capacity and any event might be enough to bring the system down should it occur during the workday. Keep in mind that I/O capacity applies to all aspects: disk, RAM, CPU, and network. Often 3 of the 4 are doing just fine but one might be strained. I'd check the performance logs that should be kept by the system during such an update and see what area is suffering the most. RAM and CPU are easy fixes (more systems, add RAM); network and disk can be a bit trickier.
Eric C. Lukens IT Security Policy and Risk Assessment Analyst University of Northern Iowa
Nowhere Near Capacity... except one
We're actually built out quite a bit, with capacity for CPU, RAM and net not being an issue at all. I know that our bottleneck and I/O issue is our SANs right now; it just seems like they should be able to handle the load. We have plenty of space and about 10 Gb of net interface to our 3 SANs.
We have 3 EqualLogic SANs, about 16 TB of total storage, all on 15k drives. The defs end up being about 112 MB unpackaged. Anyone think pushing that down 750+ times across 42 spindles is enough to cripple our entire VM/SAN system for 2-3 hours? It just seems like they should be able to handle that more rapidly/smoothly.
Thanks again for your time and brain cycles. :)
Capacity is not the issue; it's IOPS. Assuming your 15k drives are capable of 140 IOPS per drive, you have 42 x 140, or 5,880, IOPS available. Assuming 30 IOPS per VM, you require 30 x 750, or 22,500, IOPS. So you are short a massive amount of disk speed. My suggestion is to use a tool such as Liquid Ware Labs or Lakeside to determine your peak IOPS, then look at how many more disks you need to buy. 30 IOPS is very conservative, so don't be shocked if you find out you will need 200-plus more disks.
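For anyone who wants to plug in their own numbers, the arithmetic above can be sketched in a few lines of Python. The 140 IOPS per spindle figure is the one that matches the 5,880 total, and both it and the 30 IOPS per VM are the poster's assumptions, not measurements. The sketch also ignores RAID write penalties and controller cache, so treat the result as a rough lower bound rather than a sizing answer.

```python
import math

def iops_shortfall(spindles, iops_per_spindle, vms, iops_per_vm):
    """Return (available IOPS, required IOPS, extra spindles needed)."""
    available = spindles * iops_per_spindle
    required = vms * iops_per_vm
    deficit = max(0, required - available)
    # Round up: a fraction of a disk still means buying a whole disk.
    extra_spindles = math.ceil(deficit / iops_per_spindle)
    return available, required, extra_spindles

# Figures from the thread: 42 spindles, 750 VMs, 30 IOPS per VM (assumed).
available, required, extra = iops_shortfall(42, 140, 750, 30)
print(available, required, extra)  # 5880 22500 119
```

At the assumed 30 IOPS per VM that is 119 more spindles; raise the per-VM figure to 45 and the same function returns 200, which is where the "200-plus" warning starts to look realistic.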
You might want to
You might want to troubleshoot with EqualLogic; hopefully you have a good support contract with them. Since Dell owns EqualLogic and Dell sells VMware software, we've found Dell is very helpful in troubleshooting issues with VMware.
Eric C. Lukens IT Security Policy and Risk Assessment Analyst University of Northern Iowa
Yea, while I've not delt with
Yea, while I've not dealt with them directly, I hear that we've had good response from them. I'm trying to investigate and test as much as I can prior to calling them, since it's not really my system. We need the issue fixed, but the proper resources aren't being allocated to do so. :-/ *sigh* One would think it'd be our only issue when users can't log in for 4 hours, but when you get a bandaid applied, everyone wants to ignore that it's still bleeding underneath and move on.
I've set up the LU server so that it will only pull the defs down at midnight. Clients check in every hour, so even if I have 4 hours of load, it won't be later than 5 before the environment is usable again.
Now we bleed.
Actually, depending on how
Actually, depending on how you have things configured, push or pull, you should not see any impact, virtual or physical.
All of our servers are virtual, no impact.
Our desktops so far are physical, set to pull. Since not all computers come on or are logged into at the EXACT same time, even with a 7-minute heartbeat, there's little chance of all 300 computers getting defs at the EXACT same moment. We don't have huge bandwidth, and can happily say that no one seems to even notice when they get new defs, multiple times a day.
If you have 20 computers in a location, and you have the heartbeat set to 7 minutes, and it's a pull, so the client has to check in to see "what's new", what are the odds of all 20 lining up perfectly in that 7 minute window?
Pretty slim.........
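To put a rough number on "pretty slim": if each client's check-in phase is independent and uniformly spread over the heartbeat, the chance that all n clients land inside the same span of length w is n * (w / T)^(n - 1), which holds for w up to half the heartbeat T. The 30-second overlap below is an assumed figure for illustration only, and the independence assumption may not hold for VDI clones that are booted or refreshed together.

```python
def prob_all_within(n_clients, overlap_seconds, heartbeat_seconds):
    """Probability that n independent, uniformly phased clients all check in
    within one span of overlap_seconds, given a repeating heartbeat.

    Uses the circular-arc result n * f**(n - 1), valid for f <= 0.5.
    """
    f = overlap_seconds / heartbeat_seconds
    if not 0 < f <= 0.5:
        raise ValueError("overlap must be between 0 and half the heartbeat")
    return n_clients * f ** (n_clients - 1)

# 20 clients, 7-minute (420 s) heartbeat, assumed 30 s overlap span.
print(prob_all_within(20, 30, 420))
```

The result is vanishingly small, which fits the physical-desktop experience described here. It says nothing about the total load when many clients merely overlap partially, which is the more likely problem on shared storage.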
Apparently virtual works differently?
Do you keep enough def versions saved so that clients don't get full defs each time there's an update?
Are they set to push or pull?
My sites - http://theamcpages.com & http://antique-engines.com
Do you have selected the
Have you selected the option to scan when new definitions arrive? We don't have 750 VMs; we have around 100. The definition updates do not impact us.
What impacts us in the VM world is when all the clients perform their scheduled scan at the same time. Antivirus vendors need to detect virtual hardware and give us a "smart" scan option which spreads the scan over a longer period of time, thus using less disk, memory, and iSCSI network utilization. A "randomize" feature on the start of the scan would help too. I don't want to create separate policies for each and every VM guest.
Hi, we have a VMware View 4
Hi, we have a VMware View 4 environment with about 75 VMs in a proof of concept scenario and are running SAV 10.1.9.9000.
This morning the CPU on all of the VMs went to nearly 100%, used by RTVscan.exe. The SAV clients are managed by a central SSC console that's outside the VMware View control, typically only configured for physical machines. Is there anything on this version of the client that we should be looking out for and/or change?
Cheers,
Paul.
Paul,
Your CPU hit 100% due to storage disk latency. The hypervisor needs to schedule write I/O to disk and tells the guests to wait their turn. This then causes the client to go into a holding pattern looking for its hard drive. The net result is latency, which causes the CPU to spike.
You can validate my comments by running an auditing tool such as Liquid Ware Labs or Lakeside.
SEP RU6 has a feature you can
SEP RU6 has a feature you can enable to randomize the scan start times. Very helpful for VMs.
Reporting into SEPM console...
We are using SEP v11 RU6A. I want to install the VDI clients with SEPM management in mind. I would prefer to have the clients report status to my SEPM console without creating any problems.
As everyone knows, when the user logs off of a transient VDI session all settings are lost, including sephwid.xml as well as the Hardware ID and Device IDs in the registry. Several possible problems come to mind, including 1) duplicate entries in the SEPM for the VDI clients, 2) VDI clients getting the wrong policies and having issues with antivirus updates, 3) resource problems with scanning, etc.
I have SEP v11 RU6A deployed to nearly 2,000 workstations & laptops and around 95 servers. I am searching for a procedure to easily install the SEP client software for testing on the “Golden Image” of the VDI system, and have them report into SEPM. Does anyone have documentation that outlines such a procedure?
Thanks,
jdk1965
jdk1965, We are using VMware
jdk1965,
We are using VMware View 4 and have successfully deployed SEP RU6MP1 (and earlier versions) with no problems. Just remember to delete the HardwareID key from the registry and delete the sephwid.xml file from your base image before cloning. You will not have duplicate entries in SEPM (at least we don't). Our thin clients are non-persistent.
Rick
Full defs?
The interesting thing is that the clients are pulling full definitions. Why are deltas not being provided? Are all of the machines going out to the cloud for their updates?
Disk I/O is usually the culprit for poor VM performance. This is doubly true during scans. Also make sure that you do not have the option checked for "Scan after new defs arrive". That will hit your SAN hard.
Senior Consultant @ Creative Breakthroughs, Inc. a Symantec Platinum Partner
http://www.cbihome.com/
Similar experience to OP
We have ~200 virtual XP desktops running on VMware & have experienced a similar performance drop to the original poster (rstaats) when client definition updates occur.
After reading the Symantec White Paper for Virtual Environments (http://eval.symantec.com/mktginfo/enterprise/white_papers/b-endpoint_protection_virtualization_best_practices.pdf), we changed the clients to Pull Mode, and set the randomisation window to 2 hours.
Testing at the time showed this worked (i.e. the clients updated at random times within the 2 hour window), and once in production, several definition update cycles (most occurring during working hours) went through with no noticeable performance degradation.
However, this morning, the same problem occurred. The download randomisation still appears to be working as expected, so I have now increased the window to 12 hours, in the hope that this will prevent any overlap and stop the problem recurring.
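To get a feel for what widening the window buys, here is a small Python sketch, not anything Symantec provides. It computes the average number of clients updating at once as clients x duration / window, and runs a seeded simulation of uniformly random start times to estimate the peak. The 5-minute per-client update duration is purely an assumption; the real figure depends on whether clients pull deltas or full definitions.

```python
import random

def expected_concurrency(clients, update_minutes, window_minutes):
    """Average number of clients updating at any moment inside the window."""
    return clients * update_minutes / window_minutes

def peak_concurrency(start_times, update_minutes):
    """Largest number of overlapping updates, given each client's start time."""
    events = []
    for start in start_times:
        events.append((start, 1))                    # update begins
        events.append((start + update_minutes, -1))  # update ends
    events.sort()  # at equal times, ends (-1) sort before starts (+1)
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

def simulate_peak(clients, update_minutes, window_minutes, seed=0):
    """One seeded run with start times drawn uniformly across the window."""
    rng = random.Random(seed)
    starts = [rng.uniform(0, window_minutes) for _ in range(clients)]
    return peak_concurrency(starts, update_minutes)

# ~200 desktops, assumed 5-minute update, 2-hour vs 12-hour window.
for window in (120, 720):
    print(window, expected_concurrency(200, 5, window), simulate_peak(200, 5, window))
```

Under these assumptions the average drops from roughly 8 concurrent updates with a 2-hour window to under 2 with a 12-hour window. Random start times can still cluster, so the simulated peak will sit above the average, and a much longer per-client update (a full definition set, for example) would push both numbers up.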
One question I have: is there anything that might force the clients to download full definitions (rather than just deltas)? If so, it might explain why some def updates are fine and others create a big performance hit.
Anyone have any comments/suggestions?