Buffer I/O errors

Reference: http://website-humblec.rhcloud.com/buffer-io-errors-in-my-red-hat-enterprise-linux-system/

On some hosts configured with multipath, the kernel log shows error messages like the following.

Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sda, logical block 0
Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sdd, logical block 0
Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sdd, logical block 1
Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sdd, logical block 2
Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sdd, logical block 3
Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sdd, logical block 4
Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sdd, logical block 5
Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sdd, logical block 6
Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sdd, logical block 7
Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sda, logical block 0
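
To see at a glance which sd devices are reporting these errors, the log lines can be tallied per device. The following is a minimal sketch, assuming the message format shown in the excerpt above; the function name count_buffer_io_errors and the sample list are made up for illustration and are not part of any tool.

```python
import re
from collections import Counter

# Matches the kernel message format shown in the log excerpt above.
BUFFER_IO_RE = re.compile(r"Buffer I/O error on device ([\w-]+), logical block (\d+)")


def count_buffer_io_errors(lines):
    """Return a Counter mapping device name -> number of Buffer I/O error lines."""
    counts = Counter()
    for line in lines:
        match = BUFFER_IO_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample: three lines copied from the excerpt above.
sample = [
    "Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sda, logical block 0",
    "Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sdd, logical block 0",
    "Jan 13 13:40:40 humble-node kernel: Buffer I/O error on device sdd, logical block 1",
]

for device, count in sorted(count_buffer_io_errors(sample).items()):
    print(device, count)
```

If the devices that show up here are always the same ones, that is a hint that the errors follow particular paths rather than a failing disk.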

Running LVM commands may also print error messages like the following.

[root@host ~]# pvs
  /dev/sdq: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdb: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdf: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdh: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdj: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdm: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdo: read failed after 0 of 4096 at 0: Input/output error
...
  
[root@host ~]# vgs
  /dev/sdq: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdb: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdf: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdh: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdj: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdm: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdo: read failed after 0 of 4096 at 0: Input/output error
...
  
[root@host ~]# lvs
  /dev/sdq: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdb: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdf: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdh: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdj: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdm: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdo: read failed after 0 of 4096 at 0: Input/output error
  ...
[root@host ~]#
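
The list of devices that LVM could not read can be pulled out of this output and then compared against the paths of each multipath device. Below is a rough sketch, assuming only the "read failed after ... of ... at ..." prefix seen above; it deliberately ignores the locale-dependent error text after the last colon. The function name failed_devices and the sample string are hypothetical.

```python
import re

# Matches only the device path and the fixed "read failed" prefix;
# the trailing error text varies with the system locale and is ignored.
READ_FAILED_RE = re.compile(r"^\s*(/dev/[^\s:]+): read failed after \d+ of \d+ at \d+")


def failed_devices(output):
    """Return the device paths LVM failed to read, in first-seen order, without duplicates."""
    devices = []
    for line in output.splitlines():
        match = READ_FAILED_RE.match(line)
        if match and match.group(1) not in devices:
            devices.append(match.group(1))
    return devices


# Hypothetical sample modelled on the pvs output above.
sample_output = """\
  /dev/sdq: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdb: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdq: read failed after 0 of 4096 at 0: Input/output error
"""

print(failed_devices(sample_output))
```

Because pvs, vgs, and lvs each scan the same block devices, the same list is expected from all three commands.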

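If these errors appear constantly while the affected disks and file systems remain fully usable, the cause is a mismatch between the storage array's configuration and the host's multipath configuration.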

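As the reference link explains, this happens when the host tries to access the storage through every path via multipath, but the storage array is configured as Active-Passive (that is, it is set up to respond only on the active path rather than on all paths).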

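The host therefore expects responses on both path 1 and path 2, but the storage responds on only one of them, and the I/O sent down the other path produces the errors shown above.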

In this case, aligning the host's multipath configuration with the storage configuration makes the errors go away.