Using live upgrade to split or merge file systems

December 9, 2008 by shyjack

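Besides upgrading the operating system online, live upgrade can also be used to split or merge file systems.

Suppose the existing system runs Solaris 10 and has two disks, c0t0d0 and c0t1d0. The system runs on c0t0d0 with only a root (/) and a swap file system, and we want to split /usr off into a file system of its own. The following steps do that: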

1. mount -F lofs /usr /usr

2. lucreate -c "sol10_old" -m /:/dev/dsk/c0t1d0s0:ufs \
       -m -:/dev/dsk/c0t1d0s1:swap \
       -m /usr:/dev/dsk/c0t1d0s3:ufs -n "sol10_new"

3. luactivate sol10_new

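After these steps, /usr has been split off on c0t1d0 as a separate file system. The trick is that /usr must be mounted as a file system: because the original /usr is only a directory, we use a loopback file system (lofs) to fool live upgrade into treating it as one. Note that the switch to sol10_new only takes effect at the next reboot, which after luactivate has to be done with init 6 or shutdown rather than reboot.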
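One way to confirm the result after booting into sol10_new is to ask df which file system a path lives on. The helper below is only a sketch: the function name mount_point_of is made up for illustration, and it assumes the mount point is the last field of the df -k output.

```shell
# Hypothetical helper (not part of live upgrade): print the mount point
# of the file system that holds the given path.
# Assumes the mount point is the last field of `df -k` output.
mount_point_of() {
    df -k "$1" | awk '{ mp = $NF } END { print mp }'
}

# Reports /usr when /usr is a file system of its own (or lofs-mounted),
# and / when it is just a directory inside the root file system.
mount_point_of /usr
```

Running it before step 1 and again after booting the new boot environment shows whether /usr has really moved to its own slice.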

 

Conversely, live upgrade can also merge file systems. Suppose sol10 runs on c0t1d0 with three file systems, /, /usr and swap. The following steps merge /usr into /.

1. lucreate -c "sol10_old" -m /:/dev/dsk/c0t0d0s0:ufs \
       -m -:/dev/dsk/c0t0d0s1:swap -n "sol10_new"

2. luactivate sol10_new

VCS/VXVM upgrade & downgrade

December 9, 2008 by shyjack

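One of our two-node VCS 5.0 clusters was picked as the guinea pig for testing an upcoming application upgrade. Although this system has never gone into service, it holds 1 TB of production data, so I was very careful and took a backup with BCV first.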

The original system was Solaris 9 + vcs/vxvm 5.0. The whole operation went through the following stages:

1. vcs/vxvm downgrade to 4.1

2. OS upgrade to sol10 11/06 & vcs/vxvm 4.1 re-install

3. OS upgrade to sol10 05/08

4. vcs/vxvm upgrade from 4.1 to 5.0mp1

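The whole process was uneventful, with nothing remarkable, except that the first step, downgrading vcs/vxvm from 5.0 to 4.1, is worth a mention.

The vcs downgrade is simple and fairly straightforward.

The vxvm downgrade is the tricky part, because a veritas disk group does not support downgrading from 5.0 to 4.1. I opened a case with symantec specifically about this, and they confirmed it. So I had to find my own way.

Whether under 4.1 or 5.0, a dg is still divided into private & public regions. As long as the sizes of the private & public regions are kept the same, the dg can be re-created and the data recovered. Working from this idea, and after some experimenting, I finally succeeded in downgrading vxvm to 4.1.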
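Since the whole approach depends on the private and public regions keeping their sizes, it may be worth saving the region layout of each disk before destroying the disk group and comparing it after the re-init. The following is a rough sketch, assuming that `vxdisk list <disk>` prints lines beginning with `public:` and `private:`; the helper name region_lines and the file names are made up for illustration.

```shell
# Hypothetical helper: keep only the public/private region lines
# from `vxdisk list <disk>` output read on stdin.
region_lines() {
    sed -n -e '/^public:/p' -e '/^private:/p'
}

# Before the downgrade (under v5):
#   vxdisk list EMC0_1 | region_lines > EMC0_1.regions.v5
# After re-creating the dg under v4.1:
#   vxdisk list EMC0_1 | region_lines > EMC0_1.regions.v41
#   diff EMC0_1.regions.v5 EMC0_1.regions.v41 && echo "regions unchanged"
```

If the diff shows different offsets or lengths, it is probably safer to stop before running vxmake.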

1. Back up the vx dg information under v5

    vxprint -Qqmhspv -g sftg_datadg > dg1.txt

    Verify the configuration:

    vxprint -D - -ht < dg1.txt

1.5 Check the disk info

    vxprint -qd -g <dg_name>

 

 

2. Destroy the dg under vxvm v5.

    vxdg destroy sftg_datadg

 

3. downgrade vxvm software from v5 to v4.1

    

4. Verify the configuration under v4.1

    vxprint -D - -ht < dg1.txt

Delete the attributes that v4.1 does not support from dg1.txt until the vxprint command accepts it. In my case these were:

proxy_rid=0.1029

readonly=off

cons_reattach=off

fmr_rec_needed=off

voltype=off

siteconsistent=off

allsites=off

export=

site=

sd_name=

uber_name=

tentmv_src=off

tentmv_tgt=off

tentmv_pnd=off

 

5. Create the dg under vxvm 4.1

    vxdg -T 110 init sftg_datadg disk01=EMC0_1

6. Re-create the volumes from the configuration above

    vxmake -g sftg_datadg -d dg1.txt

7. Start the dg/volume and mount it

    vxvol -g sftg_datadg init active sftg

    mount /dev/vx/dsk/sftg_datadg/sftg /export/sftg
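The attribute clean-up in step 4 can be tedious by hand, so it could be scripted roughly as below. This is only a sketch: it assumes every attribute in the vxprint -m dump sits on its own line in name=value form, it needs a POSIX shell such as ksh or bash, and on Solaris 10 the XPG4 sed under /usr/xpg4/bin may be needed for the [[:space:]] class. The function name strip_v5_attrs and the output file name are made up for illustration.

```shell
# Attributes from the list in step 4 that v4.1 rejected in this case.
V5_ONLY_ATTRS="proxy_rid readonly cons_reattach fmr_rec_needed voltype siteconsistent allsites export site sd_name uber_name tentmv_src tentmv_tgt tentmv_pnd"

# Hypothetical filter: read a `vxprint -m` dump on stdin and drop every
# line that sets one of the attributes above. Everything else is passed
# through unchanged.
strip_v5_attrs() {
    script=$(for attr in $V5_ONLY_ATTRS; do
        printf '/^[[:space:]]*%s=/d\n' "$attr"
    done)
    sed -e "$script"
}

# Usage:
#   strip_v5_attrs < dg1.txt > dg1.v41.txt
#   vxprint -D - -ht < dg1.v41.txt     # should now be accepted
```

The patterns are anchored at the start of the line, so site= does not touch siteconsistent= and voltype= does not touch other attributes that merely end in type. If the filtered file is used, it is the one to pass to vxmake -d.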

I am writing this down so I can find it again, and I hope it helps others too.

The holiday feeling is here.

December 7, 2008 by shyjack

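Christmas is coming, there are more holidays, and work is less busy.

Over Christmas we get 4 days off in a row, the 29th is the company picnic day, which is another day off, then New Year follows right away with one more day off. Chinese New Year comes at the end of January, for which I have applied for a week of leave, and in the third week of February I take another week off,

so the coming weeks will be fairly relaxed.

The company is interesting too: once your annual leave balance exceeds 20 days, HR starts sending emails urging you to take leave. I asked some old hands, and it seems that 5 or 6 people once went on leave for 7 or 8 weeks in a row at the same time, which caused some trouble for operations, so the policy now is to start nudging once the balance goes over 20 days. Last year most of the leave I took was personal leave, so I still have plenty of annual leave.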

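Adding it up carefully,

9 public holidays a year, plus one extra paid picnic day from the company,

10 days of paid personal leave,

20 days of paid annual leave,

40 days in total,

which is 8 weeks off, close to 2 months. All in all I am fairly satisfied.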

Deploying VCS5 in a solaris10 zone environment

December 7, 2008 by shyjack

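A few months ago I designed and implemented a VCS5 deployment in a solaris10 zone environment. The overall architecture is as follows:

[Figure: vr2, system architecture diagram]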

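The system runs solaris10 05/08 on two nodes. Each node has 3 zones: 2 run application software and 1 runs the oracle 10 database, with three oracle instances inside the database zone. The applications and each database instance must be able to fail over between the two nodes. The cluster software is veritas storage foundation5 MP1. Installation and configuration went fairly smoothly; the points that need attention are listed here.

1. All three local zones use exclusive-ip, i.e. each zone is assigned 2 NICs, both dedicated to that local zone. The IP agent shipped with VCS5 only supports shared-ip local zones, so I wrote my own ZoneVip agent, which configures the IP dynamically through zonecfg. Because exclusive-ip gives each local zone its own IP stack, IPMP is deployed inside the local zones, and link-based IPMP is used to save IP addresses.

2. The VCS daemon must run in the global zone while the applications run in local zones. For the agents to monitor resources running inside a local zone and take action when necessary, inter-zone communication must be set up. To do this, create the corresponding vcs user in the global zone and grant it the required privileges. Then, in the local zone, point the environment variable VCS_HOST at the global zone and run halogin username password to log in. If it succeeds, a .vcspwd file is created under the root directory of the local zone.

3. Be sure to create the local zones and attach the /opt file system first, and only then install VCS/VXVM in the global zone. That way the veritas packages are installed into the non-global zones automatically.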
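The local-zone half of point 2 can be wrapped in a small function so that a failed login is noticed right away. This is a sketch only: the function name vcs_zone_login and its arguments are made up for illustration, the VCS user is assumed to exist already in the global zone, and the directory where .vcspwd is expected defaults to / as described above.

```shell
# Hypothetical wrapper, to be run as root inside the local zone.
#   $1 = host name of the global zone (placeholder)
#   $2 = VCS user created in the global zone (placeholder)
#   $3 = password of that user (placeholder)
#   $4 = directory where .vcspwd is expected (defaults to /)
vcs_zone_login() {
    VCS_HOST=$1
    export VCS_HOST                      # point the VCS commands at the global zone
    halogin "$2" "$3" || return 1        # log in, as described in point 2
    [ -f "${4:-/}/.vcspwd" ]             # succeeds only if the credential file exists
}

# Usage:
#   vcs_zone_login global-zone-host vcsuser secret && echo "login stored"
```

Keep in mind that a password given on the command line can show up in the process list while the command runs.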

vxvm issue - vxdisk list can only see the internal disks

December 5, 2008 by shyjack

 

On the new M4000, after installing vxvm/vcs 4.1, we came across a problem: “vxdisk list” only recognizes the internal disks and ignores all the EMC disks, although they are visible from the format command.

We tried “vxdctl enable”, “vxdisk scandisks”, “cfgadm -c configure c2; cfgadm -c configure c4” and restarting the vxvm daemon, still no luck.

Since solaris can already see and use the EMC disks, something must be wrong with device discovery in Veritas vxvm. So we dug around with the vxddladm command.

drvapp11:/opt/VRTS/man # vxddladm listsupport all
LIBNAME             VID         PID
==============================================================================
libvxap.so          SUN         All
libvxatf.so         VERITAS     ATFNODES
libvxcscovrts.so    CSCOVRTS    MDS9
libvxeccs.so        ECCS        All
libvxemc.so         EMC         SYMMETRIX    <-- here is the library to discover EMC disks
libvxfujitsu.so     FUJITSU     GR710, GR720, GR730
libvxhds.so         HITACHI     All
libvxhitachi.so     HITACHI     DF350, DF400, DF400F
libvxlsiinf.so      LSI         INF-01-00
libvxnec.so         NEC         DS1200, DS1200F, DS3000SL
libvxpurple.so      SUN         T300
libvxrdac.so        VERITAS     RDACNODES
libvxsena.so        SENA        All
libvxshark.so       IBM         2105
libvxssa.so         SSA         SSA
libvxstorcomp.so    StorComp    OmniForce
libvxveritas.so     VERITAS     All
libvxvpath.so       IBM         2105
libvxxp256.so       HP          All
drvapp11:/opt/VRTS/man #

drvapp11:/root # vxdisk list    <-- no EMC disks are recognized
DEVICE       TYPE            DISK         GROUP        STATUS
c0t0d0s2     auto:none       -            -            online invalid
c0t1d0s2     auto:none       -            -            online invalid
drvapp11:/root #

drvapp11:/root # vxddladm excludearray libname=libvxemc.so    <-- here exclude the library for EMC and re-scan disks
drvapp11:/root # vxdctl enable
drvapp11:/root # vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
Disk_0       auto:cdsdisk    -            -            online
Disk_1       auto:cdsdisk    -            -            online
Disk_2       auto:cdsdisk    -            -            online
Disk_3       auto:cdsdisk    -            -            online
c0t0d0s2     auto:none       -            -            online invalid
c0t1d0s2     auto:none       -            -            online invalid
drvapp11:/root #

If libvxemc.so is added back with includearray, the disks disappear again, so we have to keep this library excluded to get things going.

We are not sure what the side effects of excluding libvxemc.so are. Rohana has logged a call with Veritas and will follow it up.

regarding the problem of “X11 connection rejected because of wrong authentication”

December 3, 2008 by shyjack

I came across this problem and dug around on google for a while. Some threads said it has something to do with low free space on / or /var, but this doesn’t apply to my case, as I still have plenty of free space on every file system.

Running truss on the client ssh session showed a failure to query the security extension, which shed some light on how to resolve it.

On the client side, “ssh -X” definitely forwards X11: $DISPLAY is set, and “netstat -an | grep 6012” (provided X11 was forwarded to :12.0) shows port 6012 listening. But trusted X11 forwarding was missing. The fix is to use “ssh -Y -X” to forward both X11 and trusted X11. Alternatively, you can put “ForwardX11Trusted yes” into your client’s ssh_config to make this behaviour the default.
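For the config file route, a per-host stanza keeps trusted forwarding limited to the machines that need it. The sketch below assumes an OpenSSH-style client that reads ~/.ssh/config; the function name enable_trusted_x11 and the host name remote-host are made up for illustration.

```shell
# Hypothetical helper: append a trusted-X11 stanza for one host to an
# OpenSSH-style client config file (default: ~/.ssh/config).
#   $1 = host name as typed on the ssh command line
#   $2 = config file to append to (optional)
enable_trusted_x11() {
    host=$1
    cfg=${2:-$HOME/.ssh/config}
    cat >> "$cfg" <<EOF
Host $host
    ForwardX11 yes
    ForwardX11Trusted yes
EOF
}

# One-off equivalent, without touching any config file:
#   ssh -Y user@remote-host
```

With OpenSSH, -Y on its own already turns on trusted X11 forwarding, so adding -X next to it is harmless but not required.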

Eric

02/Dec/08

directory-based quota

November 27, 2008 by shyjack

The other day I got a requirement to set up directory-based quotas. After digging around for a while, I found it is not easy to achieve. Native solaris is meant to set up quotas per file system, and without support from the kernel we have to look for a workaround.

Say you have the file system /export/test and you want to limit /export/test/a1 and /export/test/a2 to 100M and 200M respectively. The workaround is to use loopback (lofi) devices.

mkfile 100M /export/test/a1_f

mkfile 200M /export/test/a2_f

lofiadm -a /export/test/a1_f

lofiadm -a /export/test/a2_f

newfs /dev/rlofi/1

newfs /dev/rlofi/2

mount /dev/lofi/1 /export/test/a1

mount /dev/lofi/2 /export/test/a2
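The commands above assume the two files end up as lofi devices 1 and 2. lofiadm -a prints the device it actually allocated, so a slightly more defensive variant captures that name instead. This is a sketch; the helper name raw_lofi is made up for illustration.

```shell
# Hypothetical helper: map a lofi block device path to its raw device,
# e.g. /dev/lofi/3 -> /dev/rlofi/3.
raw_lofi() {
    echo "$1" | sed 's|^/dev/lofi/|/dev/rlofi/|'
}

# Sketch for one quota-limited directory (run as root on solaris):
#   mkfile 100m /export/test/a1_f
#   dev=$(lofiadm -a /export/test/a1_f)    # prints the allocated /dev/lofi/N
#   newfs "$(raw_lofi "$dev")"
#   mount -F ufs "$dev" /export/test/a1
```

The lofi mappings do not survive a reboot, so the lofiadm -a and mount steps have to be repeated at boot time, for example from a start-up script.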

tip - how to hide a device temporarily on the solaris platform

November 17, 2008 by shyjack

Very occasionally, you might need to disable a physical device to test something, and you also want to make sure it cannot be probed while solaris is up and running. Here is how to do it.

1. At the OpenBoot prompt, run delete-device as below

{0} ok " /pci@7c0/pci@0/pci@1/pci@0,2/QLGC,qla@1" $delete-device drop

Please note the space after the first double quote.

 

2. Do a reconfiguration boot with “boot -r”

Then, at the Solaris level, you will not be able to see this HBA any more.

Back home

November 16, 2008 by shyjack

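Back home from the office. The back-end systems are being upgraded this weekend. I am not the one doing the surgery this time, but I am on-call, and there are far too many phone calls, one after another. So I moved down to the living room to stay online, working cases while watching movies.

the peacemaker: the film is about nuclear warheads stolen in Russia that terrorists eventually bring to New York. Nothing new in the film itself. I came for the lead, Nicole Kidman, but in the end I did not find her all that stunning. The inner life of the main villain, on the other hand, is portrayed with real care, and you can genuinely feel the pain he has been through. Rating: 2 stars.

traitor: I personally recommend this one highly. It deals with religion, counter-terrorism and the like. The plot is tense and exciting, and its take on religion is thought-provoking. Rating from me: 5 stars.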

a very bad on-call weekend

November 16, 2008 by shyjack

I was on-call this weekend and I have to say it was really tough. I kept being called by helpdesk, by getronics and by the unix team. I had to explain again and again that this is normal, this is expected as part of the change, and blah, blah.

In the meantime, I also did some upgrades, and it took me ages to track down issues with some SMF services that failed to start. But no matter how tedious and how difficult it was, I made it in the end.

This is working for a living and living for work.