Unmodified Xbox Cluster
Paper on Xbox Cluster
- This was submitted to the ICPADS 2004 conference, but was not accepted for publication.

The Microsoft Xbox game console has an Intel Celeron 733 MHz processor (16K L1, 128K L2, 133 MHz FSB), 64 MB RAM, 8 or 10 GB of harddisk space, a 10/100 Megabit Ethernet adapter, a DVD/CD drive, four (non-standard) USB ports, and a 130 Watt power supply. New consoles can be bought in the United States for $180.

Benchmark results: High Performance Linpack (HPL), 540 Megaflops/sec (single node), 1.4 Gigaflops/sec (four-node cluster).
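The console price and HPL figures quoted above are enough to work out two derived numbers: price-performance and parallel efficiency. The sketch below is only illustrative arithmetic. It assumes the cost covers the four consoles alone (no switch, UPS, or cables), which appears to be how the 51 US-cents per Megaflop/sec figure in the 9/1/2003 log entry was obtained. The function names are made up for this example.

```python
# Figures taken from the text; everything else here is illustrative.
CONSOLE_PRICE_USD = 180.0      # price of one new console
NODES = 4                      # consoles in the cluster
SINGLE_NODE_MFLOPS = 540.0     # HPL result on one node
CLUSTER_MFLOPS = 1400.0        # HPL result on four nodes


def price_performance_cents(nodes: int, price_usd: float, mflops: float) -> float:
    """US cents spent on consoles per Megaflop/sec of measured HPL performance."""
    return nodes * price_usd * 100.0 / mflops


def parallel_efficiency(nodes: int, single_mflops: float, cluster_mflops: float) -> float:
    """Measured cluster performance divided by ideal linear scaling."""
    return cluster_mflops / (nodes * single_mflops)


if __name__ == "__main__":
    cents = price_performance_cents(NODES, CONSOLE_PRICE_USD, CLUSTER_MFLOPS)
    eff = parallel_efficiency(NODES, SINGLE_NODE_MFLOPS, CLUSTER_MFLOPS)
    print(f"{cents:.1f} US cents per Megaflop/sec")
    print(f"{eff:.1%} parallel efficiency")
```

Under these assumptions the four nodes deliver roughly two thirds of ideal linear scaling, which is plausible for HPL over 100 Megabit Ethernet.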
Objective: Build an unmodified Xbox cluster. This means that the nodes in the cluster all are unopened: A mod-chip was never installed. The harddisk and memory have never been upgraded. Software-only exploits are to be used. Hot-swaps should be avoided.
| System | Microsoft Xbox | Sony Playstation 2 |
| CPU | Intel Celeron/P3 733 MHz (133 MHz FSB); 16 KB L1 cache, 128 KB L2 cache; supports MMX and SSE instructions | "Emotion Engine", 300 MHz, MIPS R3000 compatible; full MIPS-3 instruction set with extensions from MIPS-4 and -5; two Vector Processing Units (VPU) |
| Memory | 64 MB DDR SDRAM | 32 MB RDRAM |
| Network | 100 Megabit/sec | Sold* with Sony Linux Kit (100 Megabit) |
| Harddisk | 8 or 10 GB | Sold* with Sony Linux Kit (40 GB) |
| Removable media | DVD-ROM/CD-ROM | DVD-ROM/CD-ROM |
| Power supply | 96 or 130 Watts | ? |
| Special features | Front-panel programmable color LED (red, green, orange, off, blinking, etc.); temperature sensors | N/A |
| Ports | Four USB (non-standard connectors; requires a special cable if you want to use them in Linux) | Firewire (not accessible from Linux) |
| Console Cost | US$180 | US$180 |
| Extra Costs* | N/A | US$200 Sony Linux Kit (40 GB HDD, Ethernet module); US$25 Sony 8 MB memory card (required for Linux) |
| Total Cost | US$180 | US$405 |
| 7/08/2003 | Purchased my first Xbox. |
| 7/10/2003 | Successfully loaded BusyBox Linux onto the Mega X-Key USB memory card and used the 007 exploit to boot that distro on the Xbox. |
| 7/13/2003 | Ed's Debian loaded on Node 1 (using 007 exploit, DVD must remain in the tray). |
| 7/14/2003 | A huge mistake was made when installing the Bert and Ernie Reloaded font exploit. The default.xbe file was not signed with "xbedump -font" prior to rebooting. This killed Node 1 with an Error 21 service screen. |
| 8/02/2003 | Node 1 had to be physically opened and a hot-swap procedure was used to fix the mistake. XLinux was burned from its ISO image to a CD-R using Nero on my XP laptop. XLinux boots on a regular PC and can read and write a FATX partition. I used this to put the correctly signed file on the C: drive. I was unable to get HDD_Driver to work on Windows XP. Node 1 was repaired and I could boot game DVDs again. |
| 8/03/2003 | Successfully got the font exploit to work--the 007 DVD no longer needed to be in the tray! |
| 8/17/2003 |
Purchased second Xbox. Node 2 placed into service.
Removed the discard, daytime, time, smtp, ident daemons.
Configured the NTP time servers, and set the proper timezone. Successfully
developed a "clone" procedure that allows me to build nodes and have them online less
than 30 minutes after a new console comes out of its cardboard box and plastic wrap.
Only two files must be manually
modified after the clone procedure (/etc/hostname and
/etc/network/interfaces to select the new node's hostname and IP address).
Node 2 is undergoing testing with prime95's "torture test".
|
8/18/2003 |
Removed additional unnecessary daemons (inetd, lpd, apt-proxy). Received e-mail from
Dr. Paris indicating he had a thesis advising slot open and was interested in this project.
Discovered that
lm_sensors was already installed in Ed's Debian. Running "sensors -f"
will report the motherboard and CPU temperatures in degrees Fahrenheit. Discovered
that the CPU temp goes above the maximum by a few degrees during the prime95 torture
test. Installed new version of prime95 (sprime235). Ran the benchmark test to see
how well the Xbox compares with "real" computers.
|
8/19/2003 |
Removed the "/sbin/hwclock --systohc" line I had previously inserted into the
/etc/init.d/ntpdate file. I originally put this line in so that the actual
hardware clock would get reset with the time from the NTP server, but this was apparently
screwing up Node 2 and caused it to go into a loop whenever it was rebooted. Node 1
did not appear to be affected, but since I want the nodes to be the same, I went ahead
and removed that line from both nodes. The Bert and Ernie Reloaded font exploit always
sets the HW clock to 7/4/2003, but if the internet is up, NTP should be able to get the
real time and update the Linux clock. I am starting to get concerned about the temperatures
reported by sensors. I've often seen 144 deg F when the maximum is supposed
to be 140 deg F
for the CPU. I had one node stacked on top of the other, but it was sort of wobbly. I
bought a set of four rubber-like pencil erasers, and placed each on the corner of the
bottom Xbox. These pencil erasers hold the top Xbox much more firmly in place and also
have the side effect of giving about a centimeter of clearance between the two boxes.
This may improve the temperature situation a little. I set mprime to
automatically start when the box is rebooted by adding "mprime -B" to the
/etc/init.d/bootmisc.sh file.
|
8/20/2003 | Purchased third Xbox. Node 3 placed into service. | 8/23/2003 |
Purchased fourth Xbox. Node 4 placed into service. The first three Xboxes had been
purchased from Circuit City. This one was purchased from Best Buy. It must have been
from an earlier batch because the Dashboard did not have the "Xbox Live" option
available (i.e., it was Dashboard version 4817). Having read that the exploits do
not always work on the older Dashboards, I went back to the store to purchase any
Xbox Live-enabled game. The Live-enabled game will automatically update the Dashboard
version once you get to the "New Account" screen. I purchased "Mech Assault" since I
may be able to experiment with it later on with the newer exploits that have been
coming out. Remember that 007: Agent Under Fire is not a Live-enabled game.
Anyway, I updated the Dashboard to 4920, and then used the same old 007: AUF exploit to
install the font exploit and Ed's Debian just like on Node 2 and 3. Discovered that
/sbin/xbox_tool bundled with Ed's Debian can display the HDD password.
|
8/24/2003 |
Worked on getting ssh configured so that I could rsh commands to the other nodes
from Node 1 without needing a password. Required learning how to use
ssh-keygen and the reason for the ~/.ssh/authorized_keys
file. Modified the /etc/hosts file to add hostnames xbox101, xbox102,
etc. I became tired of typing out the IP addresses all the time. Modified all four
nodes to join the GIMPS project.
All four machines are factoring rather than running primality tests because the
latest version of the mprime software puts the primality testing cutoff at 900 MHz.
|
8/29/2003 |
Web surfing to do research on MPICH (a free implementation of MPI for Linux). Did
not actually install it. Planning to install it with the ch_p4mpd
"device". This device is to be used only in homogeneous uniprocessor clusters (since
all my nodes will be Xboxes, that is about as homogeneous as you can get). Installed
the "unsupported non-commercial" version of the Intel C/C++ and Fortran compilers
for Linux. The install required rpm (it was tailored for Redhat), so I actually
installed it on a Redhat machine, tarred up the /opt/intel directory, and then moved
it over to Node 3 for testing. I discovered that Ed's Debian 0.31 is using the
2.2.5 version of libc (run /lib/libc.so.6 to get the version). I made
cc point to icc (Intel C/C++ compiler) and f77 to point to ifc (Intel Fortran). I
modified /etc/profile to include required environment variables (such as
INTEL_LICENSE_FILE).
|
8/30/2003 |
Began the install of MPICH on Nodes 3 and 4. An /etc/mpd.conf file was set up with the
MPI world password. To run MPI programs, it was discovered that the program and its data have to
be visible from both nodes, not just the node that launches the MPI application. This was cumbersome
in my setup to date because I was manually ftp'ing files back and forth to keep things in sync.
In order to fix this problem, NFS needed to be used. I configured Node 1 as the NFS server, and
Nodes 2, 3, and 4 as NFS clients. I exported the /opt and /home trees from Node 1. Now, no matter
which node I logged onto, I could access the Intel compilers, MPICH software, and files out of my
home directory. I no longer had to ftp files from my home directory on one node to another. This
makes managing the cluster much simpler. In the future, I will probably need to create user
accounts for other people, so I went ahead and set up NIS (Network Information Service). Node 1
again was selected as the server, and the other nodes were configured as NIS clients. This allows
Node 1 to manage all the user accounts, and I can change my password there (using yppasswd)
and the change will be synchronized across all nodes in the cluster. I discovered that "YP" stands for
"Yellow Pages". Node 1 is actually configured as both a NIS server and client. To make the other
nodes NFS clients, the /etc/fstab file was modified on the client nodes. To allow the
NIS clients to share user accounts with the server, the following files had to be edited:
/etc/yp.conf, /etc/passwd, /etc/shadow, /etc/group. The yp.conf file just
specified the IP address of the NIS server. The other files had special symbols added to them, and
the other local user accounts (except for root) were removed.
|
| 9/1/2003 | Installed the ATLAS numerical libraries. First install was a pre-compiled version tailored for the Pentium 3 with 256KB L2 cache. I also installed the GOTO numerical libraries, but again, they were for machines with 256KB L2 cache. A custom version of ATLAS was built. The nice thing about ATLAS is that it is "self-configuring" and tunes itself for the specific machine it gets built on. The tuning process means that the build takes a very long time. HPL (High Performance Linpack) was run using MPICH and I measured a maximum performance using all 4 nodes of 1.4 Gigaflop/sec. This puts my 4-node cluster at a price-performance of 51 US-cents per Megaflop/sec. |
| 9/6/2003 | Found and compiled the source code for blink.c, a utility program for Xbox Linux that lets you have full control over the front-panel LED. The LED can be turned off, can be set green, orange, or red, or can be configured to continuously switch between any of these states. A script was found that can be run as a cron job to monitor system load every ten minutes and change the LED status to represent load. I have not yet implemented this as a cron job, but may do so in the future. Other possibilities include changing the LED color depending upon measured CPU temperature. Recompiled ATLAS libraries with different compiler flags (using specific Intel compiler flags to optimize for the Pentium 3 and below). Re-running HPL on a single processor to see if I can do any better than 540 Megaflops/sec. |
| 9/14/2003 | Reports are starting to come in that Microsoft has released a new version of the Dashboard that plugs the holes the font exploits were using. If you connect to Xbox Live, apparently Microsoft will automatically upgrade your Xbox without your permission. |
| 9/16/2003 |
Found the source code to the fanctl program for the Gentoox Xbox Linux
distribution. I compiled this code under Debian and was able to use it (as root only).
The program lets you control the power supply fan speed. By default, the Xbox only runs
the fan at 20% speed. This has been the reason I have seen all the high temperatures.
You can configure this program to adjust the fan speed depending upon the CPU temperature.
I have a cron job that runs every 15 minutes and will set the speed to low (20%) if the
temperature is 104F or less, medium (50%) if between 104F-136F, and high (100%) if greater
than 136F.
|
9/17/2003 | Installed Linux on Babu's Xbox at school. His machine was an older Xbox that had the original Dashboard. I had to use Mech Assault to upgrade his Dashboard to the Xbox Live-enabled version. We made sure that the Xbox was not connected to the internet when upgrading the Dashboard because we did not want to accidentally pick up the new exploit-unfriendly Dashboard. He is in the Advanced OS class with me this semester and will be doing his project on Grid computing using Xboxes. Once we get his machine configured, we will be able to run some benchmarks using MPICH-G. We will have to find embarrassingly parallel problems for this to work, though, since the communications link between my four nodes and his node will go over the internet. Both of us have a broadband cable modem connection using Time Warner. |
| 9/17/2003 |
Found an Unmodified Xbox Wiki and added the link to the bottom of the page.
It contains very useful information. One of the techniques
described how to add extra "partitions". Before today, each node had
a 2 GB root filesystem. Now, on Node 1, I added another 2 GB file system
and moved all the home directories to it. Recall /home is exported to all
the other nodes via NFS, so I did not need to do the same procedure on any
of the other nodes. The extra space gives a little more breathing room
for Linux. Both the original and the new "partition" were formatted using
the reiserfs file system instead of ext2 or
ext3. When I tried to reboot Node 1 to ensure it could
survive a reboot and re-mount the new 2 GB /home drive, a problem occurred
and I was unable to ssh in. I had to connect Node 1 back to the TV and
log in locally using the USB keyboard. I fixed the mistake I made in
/etc/fstab, and rebooted. All was well, and I moved the box
back to its stack. I installed Java 1.4.2 in the new /opt/java tree.
This was needed because I need to install ant (which uses Java) in order
to install the latest Globus Toolkit (for Grid computing).
|
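The fan policy from the 9/16/2003 entry is a simple three-step threshold rule, and can be sketched as below. This is only the decision logic as described in the log, not the actual cron script. The function name is invented for the example, and the command-line interface of fanctl is not given in the log, so the sketch stops short of invoking it.

```python
def fan_speed_percent(cpu_temp_f: float) -> int:
    """Map a CPU temperature in degrees Fahrenheit to a fan speed percentage.

    Thresholds follow the log entry: low (20%) at 104F or less,
    medium (50%) between 104F and 136F, high (100%) above 136F.
    """
    if cpu_temp_f <= 104.0:
        return 20
    if cpu_temp_f <= 136.0:
        return 50
    return 100


if __name__ == "__main__":
    # 144F is the reading mentioned in the 8/19/2003 entry.
    for temp in (98.0, 120.0, 144.0):
        print(f"{temp:.0f}F -> fan at {fan_speed_percent(temp)}%")
```

A cron job running every 15 minutes, as in the log, would read the temperature (for example from "sensors -f"), call a function like this, and pass the result to the fan control program.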
Hardware used: Four Xbox consoles. CyberPower 1500 AWR UPS. Linksys Etherfast 4124 24-port 100 Megabit switch. Mega X-Key USB memory card. (For Internet connectivity: Linksys BEFSR11 EtherFast Cable/DSL Router. Toshiba PCX1000 Cable Modem.)
Software used: EA's 007 Agent Under Fire DVD for Xbox. 007 distro of Ed's Xbox Debian GNU/Linux (version 0.3.1). xbedump. MPICH. HPL. Intel C++ and Fortran compilers for Linux. ATLAS.
Exploits used: 007 Agent Under Fire. Bert and Ernie reloaded.
Ed's Debian Xbox Distro: My cluster uses the 007-ized version of Ed's 0.3.1 distro, which has the 2.4.20 Linux kernel. Ed's 0.4.1 distro is now available, but an 007-ized version is not. The 0.4.1 distro adds features that are not really needed for my cluster (such as VNC server, change of X manager to gdm from xdm, etc.) They did upgrade the Linux kernel from 2.4.20 to 2.4.21, but that isn't that big of a deal. My Redhat 9 workstation is still at 2.4.20 too.
Custom changes to Ed's Debian: Removed automatic boot into X-Windows. Since
the cluster nodes are to be used "headless", it was not necessary to have
X (xdm and xfs) running. Removed 5 out of the 6 tty getty processes that
get started (commented out the lines in /etc/inittab).
The gettys are not needed because again, the nodes are headless.
Removed various unnecessary daemons (inetd, discard, daytime, time, smtp, ident, apt-proxy, lpd).
Modified /etc/resolv.conf file to select correct DNS server.
Changed /etc/localtime
link to point to /usr/share/zoneinfo/CST6CDT.
Modified /etc/default/ntp-servers
file to use NTP servers in my time zone.
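Beyond these shared changes, the log notes that only /etc/hostname and /etc/network/interfaces differ between cloned nodes, and that /etc/hosts maps names such as xbox101 and xbox102. A minimal sketch of generating those per-node files follows. The subnet, netmask, gateway, the eth0 interface name, and the rule that Node N becomes xbox(100+N) with a matching last address octet are all assumptions made for illustration; the page does not state the real addressing scheme.

```python
# All addressing details below are assumed, not taken from the cluster.
SUBNET = "192.168.1"           # hypothetical private subnet
NETMASK = "255.255.255.0"      # hypothetical netmask
GATEWAY = "192.168.1.1"        # hypothetical router address


def node_hostname(node: int) -> str:
    """Node 1 -> xbox101, Node 2 -> xbox102, and so on (assumed mapping)."""
    return f"xbox{100 + node}"


def node_ip(node: int) -> str:
    """Assumed convention: the last octet matches the hostname number."""
    return f"{SUBNET}.{100 + node}"


def render_hostname_file(node: int) -> str:
    """Contents of /etc/hostname for the given node."""
    return node_hostname(node) + "\n"


def render_interfaces_file(node: int) -> str:
    """Contents of a static /etc/network/interfaces for the given node."""
    lines = [
        "auto lo",
        "iface lo inet loopback",
        "",
        "auto eth0",
        "iface eth0 inet static",
        f"    address {node_ip(node)}",
        f"    netmask {NETMASK}",
        f"    gateway {GATEWAY}",
        "",
    ]
    return "\n".join(lines)


def render_hosts_entries(nodes: list[int]) -> str:
    """Lines to append to /etc/hosts so nodes can be reached by name."""
    return "".join(f"{node_ip(n)}\t{node_hostname(n)}\n" for n in nodes)


if __name__ == "__main__":
    print(render_hostname_file(2), end="")
    print(render_interfaces_file(2), end="")
    print(render_hosts_entries([1, 2, 3, 4]), end="")
```

Generating the two files from the node number keeps the manual step of the clone procedure down to choosing that number.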
Issues: Xbox must have an A/V cable plugged into the back of the unit, or it
will not boot. The A/V cable does not need to be connected to any other device,
so this is only a minor annoyance. Major issue: Most of the time Linux reboots,
the "clock loop" familiar to many Font exploiters occurs. It does not always occur,
but sometimes it can take 10-30 minutes for the Xbox to get itself out of the loop.
This happens even if the boxes have had continuous power. It must be somehow related
to something either the font exploit or Debian Linux is doing to the clock. Once
the reboot loop stops, the hardware clock is set to 7/4/2003 (Xbox Independence Day).
This is not a showstopper because Ed's Debian is configured to use
ntpdate to automatically set the time from a close-by NTP server.
New techniques emerged in 2003 which enabled the use of Linux without needing a mod-chip. The "007" (Electronic Arts' "Agent Under Fire") exploit appeared first and this allowed a user to run an unsigned XBE (Xbox Executable) file by using a buffer-overrun bug in the "Load Saved Game" feature. The BusyBox embedded version of Linux was modified to run on the Xbox and people were able to use it to modify files on the Xbox harddrive. A similar buffer overrun bug was later found in Microsoft's very own "Mech Assault" game.
The BusyBox Linux distribution is somewhat useful, but you really need something better. Since you had access to the Xbox hard drive, you could then install other versions of Linux, such as Ed's Debian Xbox distribution.
A little later, others found a way to get access to the Xbox hard drive without needing to use either the 007 or Mech Assault exploit. They used a dangerous trick called a "hot-swap". The hot-swap exploit requires opening the Xbox case (voiding the warranty), disconnecting the Xbox's HD from its IDE cable (while powered!), and then connecting the disk's IDE interface to a powered-on PC. The hard disk power connector is never detached. Why did the Xbox and PC need to be running? Because the Xbox harddisk is "locked" with a password. Each Xbox HD has a unique password. Whenever the disk is powered up, you cannot read or write to it until the IDE controller unlocks the disk with the proper password. By doing a hot-swap, the disk never realizes that it has switched computers and stays unlocked.
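The reason the hot-swap works can be shown with a toy model of the disk's lock state. This is a deliberately simplified sketch, not the real drive protocol: the class, its methods, and the host names are invented for illustration. The only behavior it encodes is what the paragraph above describes, namely that the lock is re-armed at power-up and is unaffected by moving the data cable.

```python
class LockedDisk:
    """Toy model of a password-locked hard disk (illustrative only)."""

    def __init__(self, password: bytes) -> None:
        self._password = password
        self.powered = False
        self.unlocked = False
        self.attached_to = "xbox"

    def power_on(self) -> None:
        # The lock is re-armed every time the disk powers up.
        self.powered = True
        self.unlocked = False

    def power_off(self) -> None:
        self.powered = False
        self.unlocked = False

    def unlock(self, password: bytes) -> bool:
        """Return True only if this unlock attempt succeeded."""
        if not self.powered or password != self._password:
            return False
        self.unlocked = True
        return True

    def move_data_cable(self, host: str) -> None:
        # Power stays connected, so the lock state does not change.
        self.attached_to = host

    def read(self) -> str:
        if not (self.powered and self.unlocked):
            raise PermissionError("disk is locked")
        return f"data read by {self.attached_to}"


if __name__ == "__main__":
    disk = LockedDisk(b"per-console secret")
    disk.power_on()
    disk.unlock(b"per-console secret")   # done by the Xbox at boot
    disk.move_data_cable("pc")           # the hot-swap itself
    print(disk.read())                   # the PC can read without the password
```

If the disk loses power at any point, the model falls back to the locked state, which is why the power connector must stay attached during the swap.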
July 4, 2003: Xbox Independence Day -- The first Font Exploit was released to the world. Now that you had the Linux files copied to the harddisk, you still had the annoying problem of having to use the 007 or Mech Assault exploit to jump-start Linux. The font exploit fixed that problem. You could now boot directly into Linux without needing to have any game DVD's in the tray. The first font exploit was known as the "Bert and Ernie" exploit. A second was called "Bert is Cheating on Ernie". A third was called "Bert and Ernie Reloaded". Several others came after these (various exploits known as "Big Fonts"). Why so many? There still exists a critical flaw with all these font exploits. If the Xbox clock becomes corrupted (which happens way too often, since its battery lasts only a few hours unlike a real PC clock), the Xbox can get stuck in a "reboot loop" whenever it gets powered on. The various exploits have attempted to solve this "reboot loop" issue with varying degrees of success. In my opinion, none of the font exploits really fix the reboot loop issue.
Other exploits followed, such as the "Audio exploit". This was a safer way to jump-start Linux without needing a game DVD in the drive, but it required the user to punch a special sequence into the Xbox hand controller, and required that a music CD be in the DVD tray. This exploit does not suffer from the reboot loop issue.
Xavier University Computer Science Summer Research 2003
5-node Microsoft Xbox cluster using mod-chips (did it ever go online?)
Llamma's Xbox Beowulf Cluster
1-node Microsoft Xbox cluster using mod-chips (more nodes planned)
NCSA and University of Illinois Computer Science Dept
70-node Sony Playstation 2 cluster (amazing what you can do with a NSF grant!)
University of Illinois Chemistry Dept
Computational chemistry codes using PS/2 vector instructions
University of Manchester Computing Dept
Implementation of Grid middleware for PS/2