WYSE terminal causing DHCP issues

One of our sites has had numerous intermittent DHCP issues. The symptoms were varied and unpredictable. Normal Windows clients would sometimes fail to lease an IP address, preventing users from logging onto the domain. Usually the client would eventually get an address, but during times of high utilization this could take many minutes (if it worked at all). The issue came to a head during our summer deployments. Our imaging process consists of booting from a CD/floppy or PXE and joining a Ghost multicast session. Not getting an IP address from DHCP was completely halting work. Our technicians had to manually assign each client an IP address and remember which ones were already used. Needless to say, tensions and blood pressure were high.

Previously I had tried troubleshooting the problem by updating to the latest firmware on our switches, checking their configs and trying to rule out any problems on the DHCP server. These were all dead ends. I couldn’t see anything strange in the packet traces, and was running out of ideas. One of our technicians, however, noticed that if he booted a system to Windows and let it get an IP address first, the BootCD would then grab the same address and everything would work. I decided to latch onto this and dig deeper. I compared traces from “cold” booted machines and “warm” (boot to Windows first) booted machines. At first I couldn’t find anything, but that was because I was only looking at BootP messages.

To try and cut down on the amount of traffic I was capturing, I set my capture filter to only grab UDP. In doing this, I also saw ARP requests coming from the DHCP clients. The machines that booted fine followed a process like this:

  1. (client) Discover
  2. (server) Offer
  3. (client) Request
  4. (server) ACK
  5. (client) ARP for offered IP
  6. (client) ARP for offered IP
  7. (client) No response to ARP – claim IP

They had no trouble getting an IP because Windows had already done all the hard work of collision detection. Unfortunately, I did not capture traffic from a Windows client in this environment; it would have been nice to see how Windows handles this. The failing (cold booted) machines would proceed like this:

  1. (client) Discover
  2. (server) Offer
  3. (client) Request
  4. (server) ACK
  5. (client) ARP for offered IP
  6. (other client) ARP Reply
  7. (client) Broadcast ARP Reply
  8. Repeat 1-7
  9. (client) Blank DHCP Request
  10. (server) NAK
  11. Repeat 9-10 until client gives up (long time)

The difference occurs at step 6. In this case, a WYSE terminal (1200LE) replied to the client’s ARP request for the offered address. Seeing another device on the network with that address, the client rebroadcast the ARP reply so others would see it, and then proceeded to request another IP address. The server tried to assign the same address again, since it had already leased it to that client. The client then kept sending requests and was sent a NAK each time. This process repeated until the client gave up.
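To make the two traces easier to compare, here is a rough Python sketch that replays them as a list of events. It is only a model of what the captures showed, not the client's actual logic: the function name, the event strings, the retry counts and the 192.0.2.50 address are all made up for illustration, and the real number of retries was not something I counted.

```python
def simulate_boot(offered_ip, arp_responders, dora_retries=2, request_retries=3):
    """Replay one client's boot sequence and return (events, got_address).

    arp_responders is the set of addresses that some other device on the
    segment answers ARP for. The retry counts are illustrative guesses.
    """
    events = []
    for _ in range(dora_retries):
        # Steps 1-4: the normal Discover / Offer / Request / ACK exchange.
        events += ["client DISCOVER", "server OFFER", "client REQUEST", "server ACK"]
        # Step 5: the client checks whether anyone already uses the address.
        events.append(f"client ARP for {offered_ip}")
        if offered_ip not in arp_responders:
            events.append("no ARP reply, client claims address")
            return events, True
        # Steps 6-7: another device answers, so the client backs off.
        events.append("other device ARP reply")
        events.append("client broadcasts ARP reply")
        # The server still has this lease recorded for the client, so the
        # next offer is the same address and the loop repeats.
    # Steps 9-10: blank requests that the server answers with NAK.
    for _ in range(request_retries):
        events += ["client blank REQUEST", "server NAK"]
    events.append("client gives up")
    return events, False


if __name__ == "__main__":
    for label, responders in (("warm", set()), ("cold", {"192.0.2.50"})):
        events, ok = simulate_boot("192.0.2.50", responders)
        print(label, "boot:", "got address" if ok else "failed")
        for event in events:
            print("   ", event)
```

The only input that changes between the two runs is whether some other device answers ARP for the offered address, which matches what the traces showed at step 6.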

So - why would a DHCP server try to hand out an address still in use? Because the lease time was up and the device did not renew during the lease time. Normal server operation is to delete a lease when it expires. Why would a client not renew its lease? I’m not sure. I’ve contacted WYSE to find out why the device doesn’t just renew its address instead of requiring a restart when the lease is up. No response yet. There’s even an option in the WYSE config files to choose whether to restart or shut down the device when the lease expires. The restarting isn’t really the issue though. When the devices are left on, they seem to go into a standby mode until woken up by mouse, keyboard or pressing the power button. When the device wakes, it presents the prompt “The dhcp lease has expired. You must restart.” Unfortunately, when in the sleep state, the devices respond to ARP but not to pings. Windows DHCP servers use ping to test for collisions.
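The mismatch between the two checks can be shown with a small Python sketch. This is a toy model under the assumptions described above (a sleeping terminal that answers ARP but not ping, a server that checks with ping, a client that checks with ARP). The `Host` class, the function names and the 192.0.2.x addresses are hypothetical, and no real packets are sent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """A device on the segment and which probes it answers (toy model)."""
    ip: str
    answers_arp: bool
    answers_ping: bool


def ping_detects_conflict(ip, hosts):
    """Server-side check: is there a host at this address that answers ping?"""
    return any(h.ip == ip and h.answers_ping for h in hosts)


def arp_detects_conflict(ip, hosts):
    """Client-side check: is there a host at this address that answers ARP?"""
    return any(h.ip == ip and h.answers_arp for h in hosts)


if __name__ == "__main__":
    # A terminal whose lease has expired but which is asleep on the old address.
    sleeping_terminal = Host("192.0.2.50", answers_arp=True, answers_ping=False)
    segment = [sleeping_terminal]

    print("server (ping) sees conflict:", ping_detects_conflict("192.0.2.50", segment))
    print("client (ARP) sees conflict: ", arp_detects_conflict("192.0.2.50", segment))
```

In this model the server considers the address free and offers it, while the client finds it occupied and refuses it, which is the deadlock seen in the cold boot traces.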

So who is at fault here? I’m not sure. I am going to read the DHCP spec and try to figure it out. Mainly because I want to know who to blame. If you have any ideas please share.

 Joel

About joelgibby

Comments

5 Responses to “WYSE terminal causing DHCP issues”
  1. Rob D says:

    Joel, nice post on sniffing DHCP. Sounds like it is the WYSE terminal causing the problem. Did you ever nail it down?

    -Rob

  2. Reuben says:

    Hey Joel,

    I’m having the same issue at a client’s site. Did you ever solve this?

  3. joel says:

    Hi Reuben,

    It looks like the main issue is that the WYSE firmware, rather than renew the lease or release it, just has the option of rebooting or shutting down. The other issue is that if the system is in a low power mode, it seems that the reboot does not happen. I went ahead and specified:

    DhcpExpire=reboot

    in our WINOS.INI file and extended our DHCP lease time to 8 days (I know this won’t work for a lot of people) which seems to have worked for us as by the time 8 days are up it’s likely a user has walked up to the terminal and seen the message to restart. The other option might be to disable the screensaver. I would like to build a test lab and set the lease to a short time to really test things though. Let us know if you find anything else that works!

    Joel

  4. ali says:

    i seem to get similar dhcp issues at a client location. The thin client is forced to reboot after getting a ‘dhcp lease expired’ message which seems to happen at regular intervals (every two hours in my case). Wyse 1200LE is the thin client model with upgraded firmware.

    the thin client was directly connected to the DSL modem, so i had a router put in between the DSL modem and the thin client and that seemed to ‘hide’ the problem. No more ‘dhcp lease expiration’ messages, and hence no more reboots, on the thin client, but the thin client still freezes or locks up and disconnects without any reason a couple of times a day, because it seems like now the router is dealing with the dhcp problem. So I am guessing there is nothing wrong with the thin client dhcp settings but rather with the DSL modem, which seems to be expiring its dhcp lease every two hours.

    i am beginning to think this is an ISP issue and debating if i should ask the user to switch isps.

    anybody else facing these issues and found any viable solution?

  5. J says:

    Same problem here as it seems, using V10L thin clients with newest firmware. Strange thing is that after rebooting the DHCP server everything works fine for about a week (using a 2008 R2 server as DC with DHCP).
    Any news so far about that topic?
