The eternal fight between admins and computers

(and very often users, as well)

Archive for the ‘Fixes’ Category

*fdisk and the 1.5TB partition size limit

Posted by Vide on September 30, 2008

If you have a large volume (like a disk array or the next-generation SATA disk) and you’re trying to create a single, giant partition for whatever reason, you should know that fdisk (for DOS compatibility reasons, I suppose) cannot create partitions bigger than ~1.5TB, although it won’t throw you any error or complain So if you want to create bigger partition, use parted (or one of its frontend). The limitation applies to fdisk, cfdisk and all the *fdisk family.

EDIT: in parted you have to change the disk partition type to something like gpt or you still won’t be able to create a partition bigger than 1.5TB.
Nonetheless, once the very large partition is created, I still haven’t found a way to format it, mount it and get all my terabytes. I’m still stuck with 1.5TB. Look at this:

server:~# parted /dev/mapper/mpath1
GNU Parted 1.7.1
Using /dev/mapper/mpath1
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Disk /dev/mapper/mpath1: 6000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number Start End Size File system Name Flags
1 17.4kB 6000GB 6000GB ext2
server:~# df -h |grep mpath1p1
/dev/mapper/mpath1p1 1.5T 5.1M 1.5T 1% /mnt/logs

I’m stuck with this. Any idea, dear lazyweb?

Posted in Fixes, Linux, Storage, Tips | Tagged: , , , | 6 Comments »

FreeBSD6 and nfsd gotchas

Posted by Vide on August 21, 2008

If you’re using FreeBSD6 as NFS server, you may find useful these quick tips related to /etc/export syntax, because otherwise you will be stuck with a generic

mountd[321]: bad exports list

in your logs.
So, what went wrong here?
One possible cause is that in your exports file you’re trying to export as shared NFS resource a symlink and not a real directory. NFS doesn’t like it at all and will simply no work.
One other, curious, glitch I found is that if you have two resource in two separate lines with the same options, the latter will fail.
Example of /etc/exports:

/path/share1
/path/share2 -network 192.168.1.0
/path/share3 -network 192.168.1.0

in this case share1 and share2 will work, while share3 won’t work and you’ll get a

mountd[321]: can't change attributes for /path/share3
mountd[321]: bad exports list line /path/share3

but if you change the network value in share3 (and only this), it will work!
Maybe there’s an explanation for this (I didn’t read all the exports(5) manpage) but anyway it’s a little bit strange.

Posted in Fixes, FreeBSD, Tips | 1 Comment »

Using Wanem to simulate a wide-area network

Posted by Vide on December 14, 2007

If you (or your company) are in the web-development business, one thing you need when testing your application, besides trying it in different browsers, is trying the user experience at differents conenction speed.

This can be achieved easily with Wanen, a live CD that enables you to create a gateway for your test computer which slows down (or intruduces errors, jitters, random disconnections) the network experience. Wanem uses the almighty iproute2 tools (more in details, tc) to accomplish its tasks.

The easiest way is to try Wanem is to boot it in a virtual environment (for example, VMWare Server, for easy management). Its use is quite straightforward, so I won’t nag you here and I’ll just invite you to read the documentation the project provides, but I let you know a couple of things/errors I found while using it (I’m going to report them to the developers as well)

  • The remote administration interface, which is basically a web page, it’s so bad written HTML that onlky works with Internet Explorer 6. Firefox will render a completely useless mess instead of the simple, plain HTML table that is supposed to be. I can’t understand how is this still possible in 2007 from people using Linux (it’s a Knoppix-based live cd!). So, be careful with the advanced settings.
  • Even if you specify a static IP address on startup (in the end it’s meant as a gateway, so DHCP is almost useless), there’ll always be a “pump” (DHCP client) process active in memory, resetting your IP from time to time, if you are in a DHCP’ed environment. To solve this, you have to do a couple of tricks because by default Wanem only gives you access to a limited control shell.
    So, acces Wanem from a remote ssh with these options:

    ssh perc@$WANEM_IP -t /bin/bash

    and enter the password you created at boot time. Now you’re in the live cd and you may try to kill the pump process. But you can’t since you don’t have enough permissions! And sudo/su ask you an inexistent root password. The solution is the “dosu” executable found in the home you’ve just entered. Resuming:

    dosu killall pump

Posted in Fixes, Linux, Networking, Tips | Tagged: , | 6 Comments »

Keepalived and TCP_CHECK problem

Posted by Vide on November 23, 2007

Today I was debugging a problem I had with keepalived not discovering that a real server behind a virtual IP it manages, had died.

The problem was really strange because the check was very, very simple

real_server 192.168.1.65 3306
{
TCP_CHECK
{
connect_port 3306
bindto 192.168.1.65
connect_timeout 2
}
}

This configuration was created after reading keepalived.conf man pages, that talk about these 3 options for the TCP_CHECK, without going in deeper details. So I assumed that bindto IPADDR has to be used to indicate to which IP address we should connect to do the check. But I was wrong, because with this configuration if the real server behind dies, keepalived doesn’t notice anything at all. This is because the “bindto” option, I guess, is used to choose to which local (to the LVS director) IP address bind to check the external IP:port.
So, changing the configuration to looks like this:


real_server 192.168.1.65 3306
{
TCP_CHECK
{
connect_port 3306
connect_timeout 2
}
}

fixed the problem. Keepalived is a great product and works quite well, but it’s documentation is a bit disappointing.

Posted in Fixes, LVS, Linux, Networking | Tagged: , , , | Leave a Comment »

Problems with SATA disks and Dell PE SC1435: how to fix

Posted by Vide on November 15, 2007

We have got a couple of Dell PowerEdge SC1435 (Dual Opteron) with a lspci output like this:


00:01.0 PCI bridge: Broadcom HT1000 PCI/PCI-X bridge
00:02.0 Host bridge: Broadcom HT1000 Legacy South Bridge
00:02.1 IDE interface: Broadcom HT1000 Legacy IDE controller
00:02.2 ISA bridge: Broadcom HT1000 LPC Bridge
00:03.0 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:03.1 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:03.2 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:04.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
00:07.0 PCI bridge: Broadcom Unknown device 0140 (rev a2)
00:08.0 PCI bridge: Broadcom Unknown device 0142 (rev a2)
00:09.0 PCI bridge: Broadcom Unknown device 0144 (rev a2)
00:0a.0 PCI bridge: Broadcom Unknown device 0142 (rev a2)
00:0b.0 PCI bridge: Broadcom Unknown device 0144 (rev a2)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 21)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 21)
03:0d.0 PCI bridge: Broadcom HT1000 PCI/PCI-X bridge (rev c0)
03:0e.0 IDE interface: Broadcom BCM5785 (HT1000) PATA/IDE Mode

it may happen that, when there is disk activity, tha SATA disk just disconnects, causing the processes using the disk to freeze for 30-60 seconds. The output in /var/log/messages could look like something like this:


ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x40000000 action 0x2 frozen
ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 512 in
res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1: port is slow to respond, please be patient (Status 0xd0)
ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/133
ata1: EH complete
SCSI device sda: 312500000 512-byte hdwr sectors (160000 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

The solution is to put

pci=noacpi

in your Grub/Lilo configuration as parameter of the kernel you’re using. I’ve experienced this problem with kernels 2.6.18 and 2.6.20, both 32 and 64 bit

EDIT:

I’ve spoken too early, it seems that the trick doesn’t work, so we are here again with this SATA problem on these machines. Any idea from the web?

Posted in Fixes, Linux | Tagged: , , , | 2 Comments »