3com Tftp, 3com PXE, Pxelinux Menu, Dos.imz help
I am getting sporadic lock-ups with GhostCast while using the following technologies. I followed a thread by mrguitar on how to create a PXELINUX menu system that lets me boot DOS, Linux, and WinPE from a PXE menu.
My client computers are Lenovo T410, X201, T510, and T400 laptops.
The client network cards are Intel.
My PXE server runs Windows Server 2003 R2.
I use the 3Com PXE service and 3Com TFTP. I tried Tftpd32, but it did not fix the problem, so I went back to 3Com's TFTP service.
Using the Boot Tab Editor, I have set pxelinux.0 as the menu and use a default file to define the menu settings.
I am currently using an MS-DOS boot image created by the Ghost Boot Wizard and converted to .imz using WinImage.
The boot image connects to a GhostCast session running on a Windows Server 2008 R2 server.
What I am experiencing is that about 1 out of 5 times I ghost a machine, it locks up at a random point in the imaging. The clients always get the DOS .imz file and always connect to the GhostCast session. However, they can stop responding anywhere along the way: I have seen it happen at 2%, at 99%, and at every percentage in between. When a client stops responding I also cannot ping it; the network activity lights still blink occasionally, and I can reboot the client by pressing Ctrl-Alt-Delete.
I have tried both PC DOS and MS-DOS boot disks created by the Ghost Boot Wizard and have not seen any improvement in reliability. I am using the newest Ghost files from the 2269 package on both the boot disk and the GhostCast server on Server 2008 R2.
My current feeling is that this has something to do with the .imz being pushed by Linux and loading DOS into memory.
However, others do not seem to be experiencing these issues, as I have found it very hard to find a solution by searching. Perhaps some settings need to be changed on the Cisco switches, but I am not a network engineer and do not have access to check them.
Any help would be great. I have tried a lot of things and can post more information as requested.
Thank you
Comments
I would look at boot disks to eliminate the possibility that PXE is the cause. Once Ghost is loaded, PXE and TFTP are out of the picture and everything runs from memory. But the problem could still be related to the way the PXE-delivered boot environment is loaded and run from RAM.
If you test with CD- or USB-booted machines, you can rule out that the issue has anything to do with PXE and the way the boot image is used. If after 5 GhostCast sessions you have no more problems, it would seem you are on the right track: the problem lies in how the boot image is run when delivered via PXE.
If you determine the problem is in the PXE environment, you could test Microsoft's RIS or its later versions to see how they deal with the boot environment.
If you determine that the problem is still happening, you could test unicast rather than multicast in your cloning. Multicast issues can have a lot of different resolutions, but you would want to know that this is what you are troubleshooting before guessing much more.
good luck
If you find this post helpful please give it a thumbs up!
If you find that this solves your problem please mark it as the solution!
ICHCB, I will do just as you recommend and ghost the client machines using a memory stick instead of the PXE server. If I still get a lock-up then, as you say, I can rule out an issue with PXE and look at other possible causes.
I am already doing unicast, so that won't be helpful.
I have completed the testing using a bootable CD instead of PXE. On the 4th machine rebuild it locked up. I think this confirms that the problem is not in the PXE environment. The bootable CD is one we used long ago and, as far as I am aware, never had any problems with. Therefore I don't believe it was a faulty boot disk.
If it's not PXE or the boot disk, what would cause a unicast ghost.exe connected to a GhostCast server to randomly stop listening?
Hi,
It looks like it is the way the image is loaded into RAM. It can also be related to the structure of the boot image. Therefore I would check the size of the image and probably try to create another one.
Unfortunately I can't tell exactly what is causing the issue, and the best way to go, as mentioned by ICHCB, is to eliminate the possible causes one by one.
Does it also happen on other types of hardware?
Why not try booting the image from a full Linux PXE environment and see if it is more stable? This way we could eliminate the PXE environment and the image as the cause of the issue.
Here are some setups I did in the past that might help:
https://www-secure.symantec.com/connect/articles/b...
https://www-secure.symantec.com/connect/articles/h...
Cheers
Lohizune,
First question, on the size of the image. I created the disk using the Symantec Ghost Boot Wizard, then opened the .sys file with WinImage and saved it as an .imz. The file size was around 2100 KB. I noticed that WinImage showed I could add up to 16 MB even though I was only using 2100 KB. I changed the format of the image to a 2.88 MB disk and saved it again. Do you see anything wrong with how I produced the disk, the image size, or my changing the format to 2.88 MB when I was only using about 2 MB?
Second question, about your Linux PXE guide. I notice you install DHCP. Our environment is not off network, and there is already a DHCP server handing out addresses. That is why I used the 3Com solution, which just uses the network's DHCP. I want to follow your guide, but I am unsure whether it will add an additional DHCP server and cause some real headaches for the users on our network. Do you have a guide that takes into account a network that already has a DHCP server, with PXE served only to the subnet the PXE server is located in?
thanks
Loren
I will be testing booting from a disk to see if I can replicate the issue in house.
Hi,
I can see that you have made some good progress. The fact that it also fails at a random point when using a boot disk makes me think there might be IRQ conflicts between devices claiming the CPU at the same time. The easy way to check is to open msinfo32 in Windows and look at the section 'Hardware Resources >> Conflicts/Sharing'. Disabling the conflicting device in the BIOS could then help.
Another simple test would be to eliminate potential disk or networking issues. For that I would try ghosting the laptop to a local drive, such as a USB drive, and see if the process is again interrupted.
Regarding my PXE setup documented above, you can skip the DHCP setup part, as a DHCP server is already serving your network. Start from step 2 and simply set options 66 and 67 in your DHCP scope options to tell your clients where to download the boot files: option 66 is the TFTP server IP and option 67 is the bootstrap file name, here "pxeboot.0". This avoids the useless PXE discovery traffic generated by the PXE clients and allows clients from other subnets to contact your PXE server.
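To make the two options concrete: on the wire, each DHCP option is a code byte, a length byte, and then the value. Below is a minimal Python sketch of that layout for options 66 and 67. The server address 192.0.2.10 is a placeholder assumption, not a value from this thread, and the helper names are hypothetical. In practice you would enter these values in the DHCP server's scope options rather than build the bytes by hand.

```python
# Sketch only: shows how DHCP options 66 and 67 are laid out as
# code/length/value records. The address used below is a placeholder.

TFTP_SERVER_NAME = 66  # option 66: TFTP server name (an IP written as text here)
BOOTFILE_NAME = 67     # option 67: boot file name


def encode_option(code: int, value: str) -> bytes:
    """Encode one DHCP option as: code byte, length byte, ASCII value."""
    data = value.encode("ascii")
    if not 1 <= len(data) <= 255:
        raise ValueError("option value must be 1 to 255 bytes long")
    return bytes([code, len(data)]) + data


def pxe_options(tftp_server: str, boot_file: str) -> bytes:
    """Concatenate option 66 and option 67 for a PXE client."""
    return (encode_option(TFTP_SERVER_NAME, tftp_server)
            + encode_option(BOOTFILE_NAME, boot_file))


if __name__ == "__main__":
    # Placeholder server address; boot file name as given in the post above.
    print(pxe_options("192.0.2.10", "pxeboot.0").hex(" "))
```

Seeing the raw layout can help when reading a packet capture of the DHCP offer to confirm that the client is actually being handed the TFTP server and boot file you expect.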
Hope it helps
Also, can you pull these machines off your network? Can you set up DHCP and PXE on an isolated network to test on? I have seen a lot of issues with network performance or packet collisions causing ghosting issues. It would be worth testing on a small isolated network now that you have eliminated PXE as the issue. You did note that the CD was an old one, so it may be worth updating the drivers on a new CD and testing again.
Cheers.
Hi again ICHCB and Lohizune.
I have tried your newest suggestions and have really narrowed down the items causing the issues.
Network Services sent over two of their best men to watch the traffic and look for errors. There were none; the machines just drop off silently. There aren't even collisions. They asked me to take everything off the network and see whether the issue still occurs, to confirm that it isn't a network issue on their side.
I put together an off-network scenario today and tried to image on a single switch with boot CDs. The result was the same: I would randomly get a locked-up machine.
This got me thinking I should try different laptops. I am having the issue on Lenovo T400, T410, X201, T500, and T510 laptops. They all use the same Intel driver.
I went and grabbed 10 of my T61 laptops, which are about 3 years old and also use the Intel driver. I didn't change the Intel driver I was using for the above machines. I imaged each of these laptops about 5 times, so 50 successful reimages using my PXE network and production server. I was not able to cause a single lock-up.
That being said, we are left with the laptop hardware and the Intel driver. I like the idea suggested for hardware conflicts. I will go into Windows after they boot and see if there are any hardware conflicts.
This may not get done until Monday, so have a good weekend. Thank you for the suggestions.
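As a rough sanity check on how much these test counts prove, here is a small Python sketch. It assumes each imaging run fails independently with the roughly 1-in-5 rate reported in the original post; that independence is an assumption, and the function names are hypothetical. Under that assumption, five clean runs in a row would happen by chance about a third of the time, so five sessions alone cannot rule a cause out, while 50 clean runs would happen by chance only about once in 70,000 tries.

```python
import math


def clean_run_probability(p_fail: float, runs: int) -> float:
    """Chance of seeing `runs` clean sessions in a row purely by luck,
    assuming each run fails independently with probability p_fail."""
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be between 0 and 1 (exclusive)")
    return (1.0 - p_fail) ** runs


def runs_for_confidence(p_fail: float, confidence: float = 0.95) -> int:
    """Smallest number of consecutive clean runs needed before a lucky
    streak becomes less likely than (1 - confidence)."""
    if not 0.0 < p_fail < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("p_fail and confidence must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    p_fail = 0.2  # "1 out of 5" lock-up rate from the original post
    for runs in (5, 50):
        chance = clean_run_probability(p_fail, runs)
        print(f"{runs} clean runs by luck alone: {chance:.6f}")
    print("clean runs needed for 95% confidence:", runs_for_confidence(p_fail))
```

By this estimate, about 14 clean runs in a row are needed before a test configuration can be called good with 95% confidence, which is why the 50-run T61 result is strong evidence that those machines are not affected.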
Great test.
Intel NICs used to be rock solid, but lately they have gone very high-performance and high-feature, and each has started to require its own driver. At the least, it is not like it was back in the day, when two drivers, one for the 100 Mb NICs and one for the 1 Gb NICs, would work for all Intel NICs. Other vendors did this in the past, and that was one reason Intel was worth the extra money. I still think they are worth the extra money, but they don't have the rock-solid aspect they used to have with just two drivers.
good luck.