Four hours ago, right after a program warned about a bus error, I got the following message in dmesg, pointing to some sort of disk failure:

02:52:51 end_request: I/O error, dev sdc, sector 251463

I tried to re-mount the partition several times. Every attempt ended with mount hanging, and the dmesg messages looked like this each time:

04:50:21 usb 1-3: new high-speed USB device number 22 using ehci-pci
04:50:21 usb-storage 1-3:1.0: USB Mass Storage device detected
04:50:21 usb-storage 1-3:1.0: Quirks match for vid **** pid ****: ***
04:50:21 scsi14 : usb-storage 1-3:1.0
04:50:22 scsi 14:0:0:0: Direct-Access     ******** *******          0811 PQ: 0 ANSI: 0
04:50:22 sd 14:0:0:0: [sdc] 160836480 512-byte logical blocks: (82.3 GB/76.6 GiB)
04:50:22 sd 14:0:0:0: [sdc] Test WP failed, assume Write Enabled
04:50:22 sd 14:0:0:0: [sdc] Cache data unavailable
04:50:22 sd 14:0:0:0: [sdc] Assuming drive cache: write through
04:50:22 sd 14:0:0:0: [sdc] Test WP failed, assume Write Enabled
04:50:22 sd 14:0:0:0: [sdc] Cache data unavailable
04:50:22 sd 14:0:0:0: [sdc] Assuming drive cache: write through
04:50:22  sdc: sdc1
04:50:22 sd 14:0:0:0: [sdc] Test WP failed, assume Write Enabled
04:50:22 sd 14:0:0:0: [sdc] Cache data unavailable
04:50:22 sd 14:0:0:0: [sdc] Assuming drive cache: write through
04:50:22 sd 14:0:0:0: [sdc] Attached SCSI disk
05:02:04 usb 1-3: reset high-speed USB device number 22 using ehci-pci
05:02:35 usb 1-3: reset high-speed USB device number 22 using ehci-pci
05:03:06 usb 1-3: reset high-speed USB device number 22 using ehci-pci
05:03:37 usb 1-3: reset high-speed USB device number 22 using ehci-pci
05:04:08 usb 1-3: reset high-speed USB device number 22 using ehci-pci

Note

Some of the text is masked, since I have no idea whether any of it is sensitive data.

Lots of resets, so I unplugged the drive, and mount quit with an error like the following:

mount: wrong fs type, bad option, bad superblock on /dev/sdc1,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

I looked at a few pages on the Internet, and they all said that if it’s a physical issue, it might be the end of the story for the drive. I didn’t really feel anything, since there is nothing important on that drive and the disk is two months past its tenth birthday. Its story should have ended a long time ago. If the data is lost for sure, then so be it. It’s not as if I don’t value what’s on it, but I have no clear idea what’s on it.

Anyway, this was a great opportunity to try fsck. Here is the dmesg output from right after I ran fsck.ext3 on that partition:

05:04:39 usb 1-3: reset high-speed USB device number 22 using ehci-pci
05:04:39 sd 14:0:0:0: [sdc] Unhandled error code
05:04:39 sd 14:0:0:0: [sdc]
05:04:39 Result: hostbyte=0x05 driverbyte=0x00
05:04:39 sd 14:0:0:0: [sdc] CDB:
05:04:39 cdb[0]=0x28: ** 00 00 03 ** 0f 00 ** 20 00
05:04:39 end_request: I/O error, dev sdc, sector 255247
05:04:39 quiet_error: 2 callbacks suppressed
05:04:39 Buffer I/O error on device sdc1, logical block 63796
05:04:39 Buffer I/O error on device sdc1, logical block 63797
05:04:39 Buffer I/O error on device sdc1, logical block 63798
05:04:39 Buffer I/O error on device sdc1, logical block 63799
05:04:39 Buffer I/O error on device sdc1, logical block 63800
05:04:39 Buffer I/O error on device sdc1, logical block 63801
05:04:39 Buffer I/O error on device sdc1, logical block 63802
05:04:39 Buffer I/O error on device sdc1, logical block 63803
05:05:10 usb 1-3: reset high-speed USB device number 22 using ehci-pci
05:05:41 usb 1-3: reset high-speed USB device number 22 using ehci-pci
05:06:12 usb 1-3: reset high-speed USB device number 22 using ehci-pci
05:06:43 usb 1-3: reset high-speed USB device number 22 using ehci-pci
05:07:14 usb 1-3: reset high-speed USB device number 22 using ehci-pci

Here is the output of fsck.ext3:

e2fsck 1.42.7 (21-Jan-2013)
/backup: recovering journal
Error reading block 31425 (Attempt to read block from filesystem resulted in short read).  Ignore error<y>? yes
Force rewrite<y>? yes
Error reading block 31546 (Attempt to read block from filesystem resulted in short read).  Ignore error<y>? yes
Force rewrite<y>? yes
Error reading block 31900 (Attempt to read block from filesystem resulted in short read).  Ignore error<y>? yes
Force rewrite<y>? yes
Error reading block 32021 (Attempt to read block from filesystem resulted in short read).  Ignore error<y>? yes
Force rewrite<y>? yes
Error reading block 32522 (Attempt to read block from filesystem resulted in short read).  Ignore error<y>? yes
Force rewrite<y>? yes
Error reading block 32643 (Attempt to read block from filesystem resulted in short read).  Ignore error<y>? yes
Force rewrite<y>? yes
/backup has been mounted 192 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 5472278, i_blocks is 196776, should be 188584.  Fix<y>? yes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences:  -(18404188--18404193) -(18404400--18404608) -(18404616--18404735) -(18404776--18404808) -(18404816--18404866) -(18404872--18405185) -(18405192--18405213) -(18405250--18405255) -(18405267--18405270) -(18405432--18405555) -(18471560--18471580) -(18471585--18471683) -(18477619--18477630) -(18477936--18477937) -18477944
Fix<y>? yes
Free blocks count wrong for group #561 (0, counted=889).
Fix<y>? yes
Free blocks count wrong for group #563 (0, counted=135).
Fix<y>? yes
Free blocks count wrong (10922249, counted=9658656).
Fix<y>? yes
Free inodes count wrong (10058397, counted=10058993).
Fix<y>? yes

/backup: ***** FILE SYSTEM WAS MODIFIED *****
/backup: 783/10059776 files (76.1% non-contiguous), 10444675/20103331 blocks
[exit status = 1]
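
The block numbers in the e2fsck output can be related back to the sector numbers in dmesg. The following is only a rough sketch in Python, and it rests on two assumptions that are not stated anywhere above: that the ext3 block size is 4 KiB and that sdc1 starts at sector 63 of the disk. The function names are made up for illustration.

```python
# Assumptions (not confirmed by the logs): 4 KiB filesystem blocks and a
# partition that starts at sector 63. The 512-byte sector size is the
# logical block size reported by dmesg.
SECTOR_SIZE = 512
FS_BLOCK_SIZE = 4096          # assumed
PARTITION_START_SECTOR = 63   # assumed

SECTORS_PER_BLOCK = FS_BLOCK_SIZE // SECTOR_SIZE


def fs_block_to_disk_sector(block):
    """Return the first whole-disk sector covered by a filesystem block."""
    return PARTITION_START_SECTOR + block * SECTORS_PER_BLOCK


def disk_sector_to_fs_block(sector):
    """Return the filesystem block that contains a whole-disk sector."""
    return (sector - PARTITION_START_SECTOR) // SECTORS_PER_BLOCK


# First block e2fsck could not read, and the two sectors from dmesg.
print(fs_block_to_disk_sector(31425))
print(disk_sector_to_fs_block(251463))
print(disk_sector_to_fs_block(255247))
```

Under these assumptions, block 31425 lands on sector 251463, the sector from the very first I/O error, and sector 255247 falls in block 31898, close to the block 31900 that e2fsck complained about. That suggests the assumptions are at least plausible, and that the bad spots sit in one small region near the start of the disk.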

Since I had no idea what fsck would do, I just kept pressing Enter to accept the default action. The fixing process took about an hour. After it was done, the partition mounted without any issues.
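
The `[exit status = 1]` at the end of the fsck output is a bit mask. Here is a small Python sketch that decodes it, using the bit meanings listed in the e2fsck man page; the helper name is hypothetical.

```python
# Exit status bits as documented in the e2fsck man page.
E2FSCK_EXIT_BITS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def describe_exit_status(status):
    """Return the list of conditions encoded in an e2fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in E2FSCK_EXIT_BITS.items() if status & bit]


print(describe_exit_status(1))
```

So a status of 1 means errors were found and corrected, with none reported as left uncorrected. It says nothing about whether the contents of the rewritten blocks survived.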

Is there any corruption or loss of files? Well, I have no idea. Even if there is, the result is much better than the last resort I had in mind, which was re-formatting the partition. No re-formatting, no running special rescue programs; I couldn’t be happier.
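
One cheap way to get a partial answer would be to read every file once and see which ones the kernel refuses to return. The Python sketch below does that. It is only an illustration: the function name is made up, and it can only catch files that fail with an I/O error, not files whose content was silently damaged.

```python
import os


def find_unreadable_files(root, chunk_size=1024 * 1024):
    """Return (path, error) pairs for directories that cannot be listed
    and regular files that cannot be read to the end."""
    failures = []

    def on_walk_error(exc):
        # os.walk reports directories it cannot list through this callback.
        failures.append((exc.filename, str(exc)))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks, sockets, device nodes and the like
            try:
                with open(path, "rb") as handle:
                    # Read in chunks so large files do not fill memory.
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, str(exc)))
    return failures
```

Calling `find_unreadable_files("/backup")` would scan the drive, where `/backup` is only a guess at the mount point based on the name in the e2fsck output. An empty list would mean every file could be read to the end, which is reassuring but not proof that nothing was lost.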

Will I get another error like this? I haven’t got a fsck clue. Deep down, I think this drive is at least halfway into its grave; it’s only a matter of time.