SANbox - 02/10/10

Picture of my custom Case

Each power supply powers one of the mainboards, and two of the fans (1 in/1 out). Prior to putting the door on the case I’m going to add a shroud onto each of the IN fans, to direct air onto the motherboards.

I mounted the boards so that all of the connectors are facing forward for ease of access.

The systems will be directly connected by a fiber connection and a copper connection. The fiber connection will go through the QLogic HBA, which will be running a driver with TCP support for the DRBD replication. The copper connection will be an additional ring for the cluster stack to communicate over.

Each system will also have a fiber and copper connection that leaves the box and plugs into the appropriate switch.

I haven’t decided how I’m going to install the base storage of 4 disks (with plans to expand to 8 soon). I’d like to have them in the case, but I’m concerned things might get too crowded – so I may build a small box and run them externally. I just have to make sure the SATA runs stay under 1 m.

Here’s a picture of the case I’ve built to house the computing end of my home SAN. The lower two fans pull air into the case, and the upper fans pull air out.

How-to: Delayed - 02/2/10

There will be a day or two delay until the next part of the how-to – last night I cooked my test setup, so I am rebuilding it and trying to figure out what I did wrong.

The next installment will cover configuring DRBD in a dual-primary configuration, monitored by Pacemaker.

ETA: Thursday

Directory Services Restore mode is my hero - 01/22/10

Have you ever been at work or at home working on a computer, entered a command (or otherwise taken a critical action) – and as your finger is releasing pressure off of the <enter> key or the mouse button, you suddenly break into a cold sweat, realize what you’ve just done – and have a horrendous urge to reach for a garbage can to vomit?

I just had one of those moments. I rebuilt my ESX server, upgrading it from 3.5 to ESXi 4.0 U1. That went smoothly. I presented my iSCSI LUNs – all smooth.

I then upgraded the virtual hardware on my domain controller. Then I realized that it’s Server 2008 Enterprise edition, then I recalled that I have AD installed on the D: drive, and finally I remembered this KB article. Essentially, ESX 4 presents VMDKs to systems as SAN devices, which Enterprise edition sets offline by default until instructed otherwise.

Welcome to a BSOD loop on my only domain controller.

Fortunately, the system successfully booted to Directory Services Restore Mode – which allowed me to online the D: disk, and successfully reboot my system.

How-to: Active/Active iSCSI + VMware (Part 1) - 01/20/10

As promised earlier, here is the first installment of the how-to. Before we get going too far, this series of how-to’s has the following disclaimers:

  • A basic knowledge of Linux is assumed. I’ll provide the commands to perform certain activities, but I’ll assume you know how to get a basic Linux install going
  • This how-to is written with CentOS 5.4 in mind – any distro will do, but you may need to modify some commands to make it happen.
  • This how-to uses virtual machines and is the result of proofing out the concept. I’m sure that when I build the final servers – I’ll fine tune it a bit more (which will result in an addendum to the how-to :) )
  • This how-to will be done in a progressive fashion; each part will layer another level of functionality onto the configuration.

In Part 1 of the how-to, we will complete the creation of two virtual machines and install Linux, VMtools, the needed storage space, and the required software. At the end of this part, you will have two functional virtual machines that replicate storage between themselves in a Primary/Secondary fashion, and you will be able to share out the LUN from the primary node via iSCSI.

0. Most commands below need to be executed with root privileges

  1. Create two virtual machines. I created them on my desktop using VMware Workstation. I created the machines with two network cards, one of which was bridged over the desktop’s NIC, the other on vmnet0 (or any other non-bridged network) – to provide a private network connection for the two storage nodes. Additionally, I removed all fluff hardware from the VM (USB, floppy, etc.) and attached a 2.5 GB thin-provisioned HDD to each machine.
  2. Install Linux – I did this by doing a netinstall and deselecting all packages during the install process. This means I will need to add everything to the system. The output of df -h on a completed system is below
    • /dev/sda2 1.5G 1.1G 325M 78% /
      /dev/sda4 396M 34M 342M 9% /var
      /dev/sda1 99M 17M 78M 18% /boot
      tmpfs 189M 0 189M 0% /dev/shm
    • Configure eth0 to communicate on your public network
    • Configure eth1 to communicate on the private VM only network
  3. I disabled SELinux. I realize this may be a debatable action, but I just don’t understand it well enough, and I want to eliminate it as a potential problem.
    • vi /etc/selinux/config
    • change the SELINUX= line to SELINUX=disabled (this takes effect at the next reboot)
  4. optional - Next we’re going to remove a few pieces of software that I don’t want on the system -
    • echo y | yum remove iptables
    • echo y | yum remove cups-libs
  5. Next we are going to install perl and then upgrade the system
    • yum install perl
    • echo y | yum upgrade
    • reboot
  6. Install VMtools via your preferred method – once you have installed it, be sure to clean up the install files, as they take up a lot of hard drive space
  7. Now we are going to install all of the software for the core functionality of the system
    • echo y | yum install mdadm
    • echo y | yum install kmod-drbd83
    • echo y | yum install wget
    • echo y | yum install make
    • echo y | yum install gcc
    • echo y | yum install openssl-devel
    • echo y | yum install kernel-devel
    • echo y | yum install patch
  8. Download and install ietd
  9. At this point – add an additional two disks to your VM (reboot or rescan the SCSI bus to find them)
  10. Let’s create a single raid 0 array on the two disks we just added
    • mdadm --create /dev/md/d0 --auto=mdp --level=0 --raid-devices=2 /dev/sdb /dev/sdc
    • mdadm --detail --scan >> /etc/mdadm.conf
    • add DEVICE /dev/sd* to mdadm.conf as the first line of the file
  11. Step 10 creates a software raid array that we can partition. Making the array is very important for future expansion of the array, without disrupting the drbd resource that is on the array.
    • This will be evident down the road when I walk us through dynamically adding additional space to our storage system.
  12. Now let’s partition the new array
    1. fdisk /dev/md/d0
    2. n – command to create a new partition
    3. p – create it as a primary partition
    4. 1 – partition number
    5. <enter> – begin at the beginning of the disk
    6. <enter> – end at the end of the disk
    7. w – write the partition table and exit
  13. It’s time to create your drbd.conf file. You can scour the internet and make your own, or you can download mine and modify it to match your hostnames/IP config
  14. Create the drbd resource meta data
    • drbdadm create-md r0
  15. At this point you should either clone this VM and modify it appropriately for the second node, or repeat all of the above steps on the second VM. Do as you feel comfortable
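
For readers who would rather write the file from step 13 by hand, here is a minimal sketch of what a drbd.conf for this layout could look like in DRBD 8.3 syntax. It is not the file referred to in step 13. The resource name r0, the hostnames kroker01/kroker02, and the /dev/drbd1 device come from this how-to; the 10.0.0.x addresses, port 7789, the sync rate, and the /dev/md/d0p1 partition name are assumptions for illustration and will need adjusting. The script only writes the example to a scratch file so it can be reviewed before being copied anywhere.

```shell
#!/bin/sh
# Writes an EXAMPLE drbd.conf (DRBD 8.3 syntax) to a scratch file for review.
# Assumptions (not from the how-to): the 10.0.0.x addresses, port 7789,
# the 10M sync rate, and /dev/md/d0p1 as the partition created in step 12.
conf="${TMPDIR:-/tmp}/drbd.conf.example"

cat > "$conf" <<'EOF'
global { usage-count no; }

resource r0 {
  protocol C;                 # synchronous replication
  syncer { rate 10M; }        # cap resync bandwidth (assumed value)

  on kroker01 {               # must match `uname -n` on the node
    device    /dev/drbd1;
    disk      /dev/md/d0p1;   # assumed partition name on the md array
    address   10.0.0.1:7789;  # eth1, private network (assumed IP)
    meta-disk internal;
  }
  on kroker02 {
    device    /dev/drbd1;
    disk      /dev/md/d0p1;
    address   10.0.0.2:7789;
    meta-disk internal;
  }
}
EOF

echo "wrote $conf"
```

Check the real partition name with ls /dev/md* after step 12 before trusting the disk lines, since the naming of partitions on an mdp array can vary between systems.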

At this point we have two identically configured VMs that have a raid 0 array, with drbd installed and configured to write to that array. The next step is to fire up drbd for the first time and perform the initial sync between the nodes.

  1. Before we go any further, let’s set both drbd and iscsi-target to manual startup – this is annoying for a reboot, and makes it unusable in a real sense – but it allows us total control of our test scenarios
    • chkconfig --level 0123456 drbd off
    • chkconfig --level 0123456 iscsi-target off
  2. In step 13 above, you should have created your drbd.conf file. Going forward I will refer to the nodes as I have them named in my example file. kroker01 is the primary node with kroker02 the secondary.
  3. On kroker01 start drbd up and tell it to perform a sync to kroker02 (when it comes online)
    • service drbd start
    • drbdadm -- --overwrite-data-of-peer primary r0
  4. Check out /proc/drbd to verify that kroker01 is in a WFConnection state, and is the primary
    • 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
      ns:10372472 nr:0 dw:0 dr:10380564 al:0 bm:633 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
  5. Go ahead and start drbd on the second node; at this point if you cat /proc/drbd you should see connection status as Connected and ro: as Primary/Secondary, with ds:UpToDate/Inconsistent. You should also see a line in /proc/drbd detailing the status of the resource. At this point we are safe to direct disk traffic to /dev/drbd1!
    • Please note
  6. Setup ietd.conf
    • Again, google is your friend or download my example, making any needed modifications for hostname and the like.
    • service iscsi-target start
  7. From any box that has an iscsi-initiator, point it towards the IP/hostname of the primary box (kroker01).
  8. After performing a disk rescan on the client host, you should now have a new disk to work with and format.
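
As a starting point for step 6, a bare-bones ietd.conf only needs a target name and one LUN. The sketch below writes such an example to a scratch file. The IQN is a made-up placeholder, and Type=blockio is one reasonable choice rather than something this how-to prescribes; the /dev/drbd1 path is the device used above.

```shell
#!/bin/sh
# Writes an EXAMPLE ietd.conf to a scratch file for review.
# Assumption: the IQN below is a made-up name; pick your own.
conf="${TMPDIR:-/tmp}/ietd.conf.example"

cat > "$conf" <<'EOF'
# Export the DRBD device, not the underlying md partition,
# so that every write from the initiator is replicated to the peer.
Target iqn.2010-01.lan.example:kroker.r0
    Lun 0 Path=/dev/drbd1,Type=blockio
EOF

echo "wrote $conf"
```

Pointing the LUN at /dev/drbd1 rather than at the raw array is the important part: anything written behind DRBD's back never reaches the second node.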

Play around with stopping and restarting drbd on the secondary node, and observe what you see in /var/log/messages and /proc/drbd. As the secondary node comes up and down, drbd gives you various information about what it is doing and the status of the secondary disk.
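
If you get tired of eyeballing /proc/drbd while doing this, a small helper can pull out the three fields that matter (cs, ro, ds). This is just a convenience sketch; drbd_field is a hypothetical helper name, not part of drbd, and the sample line is the one from step 4 above.

```shell
#!/bin/sh
# drbd_field NAME LINE
# Prints the value of the NAME: token (cs, ro or ds) from a /proc/drbd
# status line. Hypothetical helper, not something shipped with drbd.
drbd_field() {
  printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1://p" | head -n 1
}

# Sample status line taken from step 4 of the how-to.
line=' 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----'

echo "connection: $(drbd_field cs "$line")"
echo "roles:      $(drbd_field ro "$line")"
echo "disks:      $(drbd_field ds "$line")"
```

On a live node you could feed it the real thing, for example drbd_field cs "$(grep 'cs:' /proc/drbd)", keeping in mind that it only looks at the first matching token, so it suits a single-resource setup like this one.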

What can I do with this?

Honestly – at this point we have an expensive, highly redundant, but manual raid 1 disk array for you to play with. Should your primary node fail, you will have a block for block copy of your data on a second node, that you could pretty easily bring up as the primary node to access your data.
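
A rough sketch of that manual failover, as it might be run on kroker02 once kroker01 is confirmed dead, is below. It defaults to a dry run that only prints the commands; run and promote_secondary are hypothetical helper names, and the sequence assumes the secondary's data was UpToDate when the primary went away.

```shell
#!/bin/sh
# Manual failover sketch for the surviving node (kroker02).
# DRY_RUN=1 (the default) only prints what would be executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

promote_secondary() {
  # Take over the primary role for resource r0 ...
  run drbdadm primary r0
  # ... then start exporting /dev/drbd1 over iSCSI from this node.
  run service iscsi-target start
}

promote_secondary
```

Since nothing at this stage moves an IP address between the nodes, the initiators would still have to be re-pointed at the second node by hand.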

Home ESX Infrastructure version 3.0 - 01/19/10

So I’ve been bragging to my co-workers and friends about my under-construction ESX environment at home.

Currently my environment is pretty simple and consists of a generic system running CentOS 5.4 farming out a few LUNs to my ESX server.

  • Pentium 4 2.8 GHz
  • Abit IS7-e w/ 1.5 GB RAM
  • Dual Port Intel E1000 Nic
  • LSILogic MegaRaid 150-4 Sata controller
  • 3×250 GB SATA disks in a Raid 5
  • A couple misc IDE drives

I have CentOS installed onto a 2.5 GB partition; the remainder of the disk (approximately 800 GB usable total) is presented as iSCSI targets using ietd 1.4.19.

As a side note, until just a few days ago this box was running Openfiler 2.3 with a P3 700 + Soyo SY-7VCA2, but I started having some stability problems with the motherboard/CPU, and a MB/CPU swap seems to have made things better.

Said ESX box is currently running 3.5 with some decent hardware.

  • Rioworks/Accelertech/Arima HDAMA Rev. G
  • 2xOpteron 248 ( Rev E.)
  • 8×1 GB PC1600R Dimms

The ESX box and CentOS system are connected via a crossover cable for iSCSI traffic.

Prior to the split setup between my ESX server and CentOS system, my storage (via the MegaRAID 150-4 card) lived in my ESX box. I needed more storage, but I didn’t want to buy another ESX HCL-listed SATA/SCSI card and the drives to go with it. By moving it all to an iSCSI system, I removed the need to use only certain types of drives.

So what’s next? I’m glad you asked – the Bowe ESX farm v3.0 will be a 100% availability infrastructure. Well not truly 100% – I will have some limitations due to internal house electricity, but the [eventual] purchase of appropriately sized UPS capacity will solve that problem.

How do I intend to accomplish this – lots of used hardware. Some of the hardware I have sitting around from previous spending binges, others have been acquired or will be acquired over the next few weeks via careful ebay shopping. I will list the price I paid, plan to pay, or would expect to pay as appropriate for my purchase situation (got, getting, had).

ESXi (2x)

  • Tyan K8SR (paid 27.50 ea. shipped)
  • Dual Opteron 270 (budget 40 per pair shipped – ebay)
  • 8x 1GB PC1600R (would pay $10-$15/dimm – bought a bunch of these a LONG time ago)
  • 1x Emulex 9802 HBA (paid $5 shipped – ebay)
  • 1x Tyan OOB management card (freebie with the K8SR)

Storage (2x)

  • Rioworks/Accelertech/Arima HDAMA (rev prior to G) (paid $20 ea. on ebay)
  • Dual Opteron 246HE (freebie CPUs that came with the K8SRs in my host setup.)
  • 2x 1 GB PC1600R (see above for pricing)
  • 1x Qlogic 2340 HBA (paid 11.50 ea. – ebay)
  • 1x Emulex 9802 HBA (paid $5 shipped – ebay)
  • Generic SATA Controller ($20 ea.)
  • 2×250 GB Sata drives (bought a while ago, but market rate is ~$35 ea.)

All systems will be booting off a 2.5 GB Compact Flash microdrive in an IDE adapter – ~50 dollars for 4 drives and adapters on ebay. All systems also have 350/400 watt power supplies: two came with the HDAMA MBs in the storage boxes, and the other three I have on hand, but I may replace them with a Sparkle FSP350-601u – which can be had on ebay for less than 20 bucks.

Infrastructure

  • 16+ port Gigabit managed switch, that supports VLAN tagging (I’ve seen some of these on ebay for ~$50 in the past few weeks. I am budgeting approximately $75 + shipping)
  • 8+ port 2 Gb Fibre Channel switch (budget is $50 + shipping on ebay)
  • GBICs (market seems to be $5-$10 ea. on eBay)

So the total infrastructure cost is less than $800 – less if you already own some of the hardware.

Software stack

Storage

  • SCST will be installed on each node to load an Emulex FC target driver to share out disk resources on the SAN Fabric
  • drbd 8.3 (in dual primary configuration) will be installed to perform replication of the disk between the two storage nodes
    • drbd will be using the Qlogic HBAs using an older driver with TCP/IP support for replication
  • pacemaker will be installed (most likely) to help control drbd and to control split-brain and act as STONITH
  • The 250 GB drives will be configured in a software raid 0.
    • Choosing to do a software raid removes dependencies on hardware raid controllers and it also will allow me to effectively scale the arrays outward by simply adding more drives.

ESX – I will be using ESXi 4.

Misc

  • My current ESX hardware will be repurposed as a physical Forefront Security Gateway 2010 (yeah Technet subscription) system. One NIC into the cable modem, the other NIC into the Gig switch with all VLANS trunked to it

So what does this give me? Once this is built out I will have a fully redundant ESX farm. I will be able to power down either ESX server or either storage server for patching, maintenance, etc without taking down my virtual machines. The only box at “risk” will be the ISA system.

At some point I’ll drop in appropriately sized UPS system(s) to provide 5-10 minutes+ of backup, although this looks like a pretty sweet solution.

Of the environment described above, the only pieces I am missing are the Opteron 270s, the Fibre Channel switch (and GBICs, although I’m going to try to get them both in one auction if possible), and the gig network switch. Depending on ebay availability, I’m looking to have all of the hardware acquired in the next month or so and the entire environment built out shortly after that – although once a few packages arrive this week I should be able to start the actual build out.

What do I know about this setup – DRBD works. I’ve been having a blast playing with various failure scenarios and split brain detection. I feel like I have a setup that is very reliable at picking the right “master” to start sync from in a failure state, but I am going to start looking at pacemaker a bit to see if it makes building some logic into the setup easier. Otherwise I will probably just code some rough bash scripts to control start-up of DRBD/SCST.

I have a fair bit to learn about the details of Fibre Channel, but I’m looking forward to the challenge.

As I said earlier – once I start the build out, or have finalized my configuration – I’ll post a detailed how-to and possibly sanitized VMs to work with.

I haven’t forgotten - 12/30/09

I haven’t forgotten about my promise to post a how-to on doing an Active/Active iSCSI VMware setup.

I’m in the process of acquiring all of the hardware I need and will do the write-up as I’m installing and configuring that environment.

ETA is mid-January. If you would like a copy of my notes in the meantime, please visit the contact page and drop me a note (or leave a comment).

How-to: Active/Active iSCSI + VMware (Intro) - 12/3/09

Over the next few weeks I’ll be writing a how-to for setting up an Active/Active iSCSI storage solution for VMware ESX.

My solution will entail:

2xiSCSI servers running CentOS 5.x, DRBD 8.3, ietd 1.4.19

2xVMware ESXi Boxes

vCenter

The goal is to set up an infrastructure that will survive everything except a simultaneous power outage at my house. In theory I should be able to take any of the boxes down for maintenance without having to shut my VMs down. That is probably my biggest annoyance about my setup at home – the need to shut everything down to patch.

I’m in the process of constructing the environment with VMs to proof it out.

I am pretty confident that this will work – I just want to proof it out before dropping $$$ on storage.

Inspiration for this:

http://blog.core-it.com.au/?p=62

Call for info - 10/13/09

Does anyone have any experience/thoughts/ideas/insight into deploying Server 2008 leveraging a tool like sudowin, or using UAC (or something) to remove the need for full admin access for application owners?

If so, please comment or visit the contact page.

Snapshots and Disk Expansions – followup - 04/7/09

Here is a good follow up on my post on snapshots and disk expansions

The post references the following VMware KB Articles

VMware KB 1007849

VMware KB 1004232

Server Upgrade - 03/27/09

Last night I migrated this website to a new web server. Well I wish it was just that:
new “physical” server
CentOS4->CentOS5
Apache upgrade
MySQL upgrade
PHP upgrade

If you notice anything out of place, please drop me a note.

Bear