Quality Control @ Dell WPD

Our office participates in the Dell Warranty Parts Direct service program. Basically it allows us to be “self-maintainers” and order our own warranty parts as needed rather than having to wait on hold with support to go through the same troubleshooting steps we’ve already completed. It’s usually an efficient way to get bad parts replaced under warranty. One of my coworkers ordered a motherboard last week. Today when he opened the antistatic bag he got an unwelcome surprise:

The Intel ICH7 chip and surrounding components were heat damaged and filthy. There was also what looked like burned thermal grease all over the chip (suggesting that a heat sink may have been installed, though the original failed board in our system had no heat sink and the chip was as clean as it was from the factory).

Flipping the board over revealed additional damage to the contacts for the various components installed around the chip:

Burned PCB Contacts

As a disclaimer, we’ve never seen damage like this before in warranty parts sent to us. In fact, most parts are in great shape and some even have that “factory new” smell to them. This instance, though, seems like either laziness or incompetence. I don’t know how something this bad could have made it through any quality control process. Visual inspection should have been the first clue. It’s not like it’s just dusty. IT’S SCORCHED. I also don’t know how much difference blogging about it will make, but at least it’s documented out there now. Dell, what’s up?

Two DRAC III / ERA Issues and Solutions

While working with some DRACs (Dell Remote Access Controller) today I was able to figure out a few issues that have been giving me trouble for quite some time:

DRAC / ERA internal address registered in DNS on Domain Controller

DRACs installed on Domain Controllers were registering their RAC PPP connection in DNS with the hostname of the computer the DRAC is installed in. This creates a problem for clients looking in DNS for a domain controller: they get an address that is either non-routable (192.168.234.235) or doesn’t respond at all. For most systems, you can simply uncheck “Register this connection’s address in DNS” in the DNS tab under Advanced options for the connection. Windows Server 2003 SP1 installed as a domain controller, however, ignores this setting and continues to register the address in DNS. There is a hotfix (included in Windows Server 2003 Service Pack 2) that addresses this issue, but you have to call Microsoft to get it (I think I’ll just install the service pack).

http://support.microsoft.com/kb/832478

Thanks to Neal’s Admin Notes:

Dual NIC Problems with NetLogon and DNS
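To see whether a domain controller's name is affected by the DNS registration problem described above, a quick lookup can help. The following is a minimal Python sketch, not something from the original troubleshooting: the function names are made up for illustration, and it assumes the RAC PPP connection uses the 192.168.234.235 address quoted above. It also relies on the system resolver, so a hosts-file entry or local cache could mask what DNS itself returns.

```python
import ipaddress
import socket

# Address quoted in the post for the RAC PPP connection; other setups may differ.
RAC_PPP_ADDRESS = "192.168.234.235"


def find_rac_addresses(addresses, rac_address=RAC_PPP_ADDRESS):
    """Return the entries of `addresses` that equal the RAC PPP address."""
    target = ipaddress.ip_address(rac_address)
    return [a for a in addresses if ipaddress.ip_address(a) == target]


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses the system resolver gives."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET)
    return sorted({info[4][0] for info in infos})


def report(hostnames):
    """Print one line per host saying whether the RAC PPP address shows up."""
    for host in hostnames:
        try:
            bad = find_rac_addresses(resolve_ipv4(host))
        except socket.gaierror as exc:
            print(f"{host}: lookup failed ({exc})")
            continue
        status = "RAC PPP address registered" if bad else "ok"
        print(f"{host}: {status}")
```

Calling `report(["dc1.example.local"])` (a hypothetical hostname) from a client machine prints one status line per domain controller.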

DRAC / ERA Console Redirection Fails with Warning – Reinstall PPP Connection

The other issue was that the graphical console redirection on some of my ERA (Embedded Remote Access) devices (basically an embedded DRAC III) was not working. The console window would launch, but I would get no video and the message “Warning: Remote Console is not available.” The first step is to wait a few minutes (sometimes it takes a while to initialize). Check. If it’s still a no-go, check the RAC services (Remote Access Connection Service and RAC VNC Service) and restart them if necessary. Check. Still no-go. Then I found this gem (note that the command is run from C:\Program Files\Dell\SysMgt\RAC in default Windows installs):

Root Cause: 

During RAC3 installation, the modem is disabled and enabled for the default name of the modem to be changed to RAC PPP connection Using RACPORT after modem driver installation. The OS fails to recognize the modem name change and the installer code is not able to find modem device to establish the connection.

Solution: 

From the command prompt run the command installppp createRacConnection. The following message will be displayed confirming that the installation was successful: 

Installing PPP connection
Successfully Installed the RAC connection

Source: http://support.dell.com/support/edocs/stor-sys/spv745N/en/RN/RelNotes.htm (an obscure set of release notes for a Dell NAS device). The instructions fixed the issue immediately – no reboot required. Kinda bizarre, but as long as it’s working, right?
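If the same fix has to be applied on several servers, the command could be wrapped in a small script. This is only a sketch under a few assumptions: the function names are hypothetical, the RAC directory is the default one mentioned above, and the success message is assumed to appear on standard output or standard error exactly as quoted in the release notes.

```python
import subprocess

# Default install directory mentioned in the post; adjust for non-default installs.
RAC_DIR = r"C:\Program Files\Dell\SysMgt\RAC"

# Success line quoted from the release notes.
SUCCESS_TEXT = "Successfully Installed the RAC connection"


def install_succeeded(output):
    """Return True if the command output contains the documented success line."""
    return SUCCESS_TEXT in output


def reinstall_ppp_connection(rac_dir=RAC_DIR):
    """Run 'installppp createRacConnection' from the RAC directory.

    shell=True lets cmd.exe locate installppp in the working directory
    without this script guessing the file extension.
    Returns (succeeded, combined_output).
    """
    result = subprocess.run(
        "installppp createRacConnection",
        shell=True,
        cwd=rac_dir,
        capture_output=True,
        text=True,
    )
    output = result.stdout + result.stderr
    return install_succeeded(output), output
```

`reinstall_ppp_connection()` would be run locally on the affected server from an administrative session; the returned output is worth logging when the success line is missing.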

Dell PowerConnect 3024/48/5012 Password Reset

Found this on the Dell Forums:

For the 3024/3048/5012 products:

1. Connect to the switch via the console port and manually reboot the switch
2. As soon as power is applied, press and hold the ESC key
3. At the command prompt, type “EmergencyPasswordReset” (case-sensitive without the quotation marks)
4. At the confirm (Y/N) prompt, type Y
5. If done properly, you will receive a message stating that the password has been disabled
6. Type G and hit enter to reboot the switch

The switch will reboot with the password disabled.

This saved us in a pinch. Just goes to show that Physical Security is still the first and most important security.

Disable Chassis Intrusion detection

After deploying Dell OMCI to about 600 desktops and portables, an alert began displaying upon user logon:

Dell OMCI Chassis Intrusion Alert

Needless to say, users were somewhat confused by this.

To get rid of the message, we had three options:

  1. Run around to every Dell PC in the organization
  2. Uninstall Dell OMCI
  3. Remotely disable chassis intrusion detection and clear any current detections

Obviously we chose option three. With Dell’s OpenManage IT Assistant software, I was able to build a remote CIM command line to execute on a set of systems (in our case any system that was reporting a status of degraded). Here’s the command we ran:

system cim action=setcim ipaddress=$IP username=$USERNAME password=$PASSWORD authenticationlevel=packet classpropertyvalue=Dell_SMBIOSsettings::ChassisIntrusion:4

To execute the command, I set up a new command line task in ITA, targeted at a query of computers whose status was not “OK.” I set this to run once an hour, since clients were still being discovered and inventoried as this was happening. By setting the query to only hit degraded clients, we avoided running this needlessly on clients already configured properly.
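For anyone adapting the command, it can help to see it broken into its parts. The Python sketch below only assembles the same string shown above; the function name is hypothetical, and the default `$IP`, `$USERNAME` and `$PASSWORD` placeholders are left as written in the post. The property value of 4 is simply the value we used, so check the OMCI documentation before changing it.

```python
def build_chassis_intrusion_command(
    ip="$IP",
    username="$USERNAME",
    password="$PASSWORD",
    value=4,
):
    """Assemble the remote CIM command line used in the post.

    The defaults reproduce the command exactly as it was run; pass real
    values only when building the string for a single, known system.
    """
    parts = [
        "system",
        "cim",
        "action=setcim",                # set a CIM property on the target
        f"ipaddress={ip}",              # target system
        f"username={username}",
        f"password={password}",
        "authenticationlevel=packet",
        # class::property:value, as used in the post
        f"classpropertyvalue=Dell_SMBIOSsettings::ChassisIntrusion:{value}",
    ]
    return " ".join(parts)


print(build_chassis_intrusion_command())
```

With the defaults, the printed line matches the command above character for character, which makes it easy to confirm nothing was mistyped before pasting it into a task.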

Dell OpenManage Documentation

Dell’s Systems Management platform has been around for quite a while. It has recently been getting more and more usable and comprehensive. While attempting to test and implement it in our architecture, I have scrolled through pages and pages of documentation, forum threads, and configuration pages. Here are the pages that have been the most useful for me:

Dell OpenManage Client Instrumentation User’s Guide

Client Instrumentation Documentation

Dell OpenManage Server Administrator User’s Guide

Server Administrator Documentation

Dell OpenManage Client Connector User’s Guide

Client Connector Documentation

Dell OpenManage IT Assistant 7.2 User’s Guide

ITAssistant Documentation

Here’s the platform in a nutshell: The Client Instrumentation (or Server Administrator for servers) is installed on the managed system, interfaces with the system’s data providers (BIOS, Disks, Chassis, etc.), and makes this data accessible through the standard WMI interface. On the Management Station (usually a server set aside for Systems Management), the IT Assistant and Client Connector allow administrators to access and manage the systems and their configurations through the WMI Interface. IT Assistant is the main management application (one to many) that provides an overall view of system health and status, and Client Connector (one to one) can be launched from IT Assistant to manage an individual client. Server Administrator can also be launched from within IT Assistant for individual Server Management, though it runs on the managed server rather than on the management station.

Some nutshell, eh?