Thread: [GENERAL] PostgreSQL on eMMC - Corrupt file system

[GENERAL] PostgreSQL on eMMC - Corrupt file system

From
Thomas Güttler
Date:
Hi PostgreSQL experts,

Today the root file system on my hardware running Ubuntu was remounted read-only because the file system had some
internal trouble.

Output of `dmesg` is below.

I am running owncloud with PostgreSQL on the hardware.

What could be the root of this problem?

It happened for the third time in two days ...

Please tell me if you need more information (cat /proc/???).

The partition `/dev/mmcblk0p2` is the root file system (eMMC).

I guess I was too naive. It looks like running Linux on eMMC needs some special considerations:
http://unix.stackexchange.com/questions/136269/corruption-proof-sd-card-filesystem-for-embedded-linux

I guess that PostgreSQL does too many file system operations and drives the file system on the eMMC "crazy".

This is my hardware: Intel Cherry Trail x5-Z8350, On board DDR3L 4GB memory, On board 32GB eMMC storage
http://up-shop.org/home/81-up-gws01w4g-memory32g-emmc-boardwo-vesa-plate.html

# Question

Is running linux with postgres on eMMC a bad idea in general? Or is my hardware broken?

Regards,
  Thomas Güttler

output of dmesg:

    [18471.780031] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock
    [18481.816797] mmc0: Timeout waiting for hardware interrupt.
    [18481.818821] ------------[ cut here ]------------
    [18481.818866] WARNING: CPU: 1 PID: 0 at /build/linux-W6HB68/linux-4.4.0/drivers/mmc/host/sdhci.c:1017 sdhci_send_command+0x714/0xc30 [sdhci]()
    [18481.818877] Modules linked in: xhci_plat_hcd nls_iso8859_1 dwc3 udc_core ulpi intel_rapl intel_powerclamp coretemp kvm_intel kvm irqbypass punit_atom_debug crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd input_leds uas usb_storage lpc_ich shpchp snd_intel_sst_acpi snd_intel_sst_core snd_soc_sst_mfld_platform snd_soc_core mei_txe snd_compress ac97_bus snd_pcm_dmaengine mei snd_pcm dwc3_pci snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq dw_dmac snd_seq_device dw_dmac_core 8250_fintek snd_timer snd intel_hid sparse_keymap soundcore i2c_designware_platform i2c_designware_core tpm_crb spi_pxa2xx_platform pwm_lpss_platform 8250_dw pwm_lpss int3400_thermal int3403_thermal int340x_thermal_zone acpi_thermal_rel mac_hid acpi_pad parport_pc
    [18481.819271]  ppdev lp parport autofs4 hid_generic usbhid hid mmc_block i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm r8169 mii fjes video sdhci_acpi sdhci pinctrl_cherryview
    [18481.819423] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.4.0-62-generic #83-Ubuntu
    [18481.819435] Hardware name: AAEON UP-CHT01/UP-CHT01, BIOS UPC1BM0S 06/04/2016
    [18481.819447]  0000000000000086 7f0cf1c4636dff29 ffff88017fc83d28 ffffffff813f7c63
    [18481.819468]  0000000000000000 ffffffffc0047460 ffff88017fc83d60 ffffffff810812d2
    [18481.819487]  ffff88017aa2f540 ffff880176779ad8 0000000000000010 0000000000000001
    [18481.819506] Call Trace:
    [18481.819521]  <IRQ>  [<ffffffff813f7c63>] dump_stack+0x63/0x90
    [18481.819571]  [<ffffffff810812d2>] warn_slowpath_common+0x82/0xc0
    [18481.819588]  [<ffffffff8108141a>] warn_slowpath_null+0x1a/0x20
    [18481.819612]  [<ffffffffc0044914>] sdhci_send_command+0x714/0xc30 [sdhci]
    [18481.819634]  [<ffffffff81404dbb>] ? __const_udelay+0x2b/0x30
    [18481.819656]  [<ffffffffc0041d49>] ? sdhci_reset+0x59/0xc0 [sdhci]
    [18481.819679]  [<ffffffffc0044f32>] sdhci_finish_data+0x102/0x350 [sdhci]
    [18481.819702]  [<ffffffffc0045180>] ? sdhci_finish_data+0x350/0x350 [sdhci]
    [18481.819724]  [<ffffffffc00451fb>] sdhci_timeout_timer+0x7b/0xc0 [sdhci]
    [18481.819746]  [<ffffffff810ecd55>] call_timer_fn+0x35/0x120
    [18481.819768]  [<ffffffffc0045180>] ? sdhci_finish_data+0x350/0x350 [sdhci]
    [18481.819785]  [<ffffffff810ed70a>] run_timer_softirq+0x23a/0x2f0
    [18481.819808]  [<ffffffff81085db1>] __do_softirq+0x101/0x290
    [18481.819828]  [<ffffffff810860b3>] irq_exit+0xa3/0xb0
    [18481.819847]  [<ffffffff8183b0a2>] smp_apic_timer_interrupt+0x42/0x50
    [18481.819868]  [<ffffffff81839362>] apic_timer_interrupt+0x82/0x90
    [18481.819876]  <EOI>  [<ffffffff816cb5d1>] ? cpuidle_enter_state+0x111/0x2b0
    [18481.819913]  [<ffffffff816cb7a7>] cpuidle_enter+0x17/0x20
    [18481.819934]  [<ffffffff810c4522>] call_cpuidle+0x32/0x60
    [18481.819950]  [<ffffffff816cb783>] ? cpuidle_select+0x13/0x20
    [18481.819969]  [<ffffffff810c47e0>] cpu_startup_entry+0x290/0x350
    [18481.819989]  [<ffffffff81051784>] start_secondary+0x154/0x190
    [18481.820005] ---[ end trace 3081f620d5ceb477 ]---
    [18481.822972] mmcblk0: error -110 sending stop command, original cmd response 0x0, card status 0x400900
    [18481.823076] mmcblk0: error -110 transferring data, sector 26544216, nr 136, cmd response 0x0, card status 0x0
    [18481.823131] blk_update_request: I/O error, dev mmcblk0, sector 26544216
    [18481.823156] blk_update_request: I/O error, dev mmcblk0, sector 26544224
    [18481.823173] blk_update_request: I/O error, dev mmcblk0, sector 26544232
    [18481.823202] blk_update_request: I/O error, dev mmcblk0, sector 26544240
    [18481.823219] blk_update_request: I/O error, dev mmcblk0, sector 26544248
    [18481.823234] blk_update_request: I/O error, dev mmcblk0, sector 26544256
    [18481.823250] blk_update_request: I/O error, dev mmcblk0, sector 26544264
    [18481.823291] blk_update_request: I/O error, dev mmcblk0, sector 26544272
    [18481.823309] blk_update_request: I/O error, dev mmcblk0, sector 26544280
    [18481.823336] blk_update_request: I/O error, dev mmcblk0, sector 26544288
    [18481.823622] Aborting journal on device mmcblk0p2-8.
    [18481.827415] EXT4-fs error (device mmcblk0p2): ext4_journal_check_start:56: Detected aborted journal
    [18481.827446] EXT4-fs (mmcblk0p2): Remounting filesystem read-only
    [18481.829542] EXT4-fs error (device mmcblk0p2): ext4_journal_check_start:56: Detected aborted journal
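
A log like the one above can be condensed before posting it. The following is only a rough sketch: it assumes the kernel keeps the `blk_update_request: I/O error, dev ..., sector ...` wording seen in this trace, and the function names `failed_sectors` and `summarize` are made up for illustration.

```python
import re
from collections import defaultdict

# Matches kernel lines such as:
#   [18481.823131] blk_update_request: I/O error, dev mmcblk0, sector 26544216
IO_ERROR = re.compile(r"blk_update_request: I/O error, dev ([^,\s]+), sector (\d+)")


def failed_sectors(dmesg_text):
    """Return {device: sorted list of sectors reported with an I/O error}."""
    found = defaultdict(set)
    for line in dmesg_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            found[match.group(1)].add(int(match.group(2)))
    return {dev: sorted(sectors) for dev, sectors in found.items()}


def summarize(dmesg_text):
    """Return one summary line per device: error count and sector range."""
    lines = []
    for dev, sectors in sorted(failed_sectors(dmesg_text).items()):
        lines.append(
            f"{dev}: {len(sectors)} failed requests, "
            f"sectors {sectors[0]}..{sectors[-1]}"
        )
    return lines


# Three I/O error lines and one unrelated line copied from the trace above.
SAMPLE = """\
[18481.823131] blk_update_request: I/O error, dev mmcblk0, sector 26544216
[18481.823156] blk_update_request: I/O error, dev mmcblk0, sector 26544224
[18481.823336] blk_update_request: I/O error, dev mmcblk0, sector 26544288
[18481.823622] Aborting journal on device mmcblk0p2-8.
"""

if __name__ == "__main__":
    print("\n".join(summarize(SAMPLE)))
```

On a live system the text could come from the output of `dmesg` instead of `SAMPLE`. If the failed sectors always fall in the same range, that hints at a bad region; if they are scattered, the timeouts in the trace are the more likely cause.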



--
http://www.thomas-guettler.de/


Re: [GENERAL] PostgreSQL on eMMC - Corrupt file system

From
Thomas Güttler
Date:
On 08.02.2017 at 07:25, Thomas Güttler wrote:
> Hi PostgreSQL experts,
>
> ...


# Update

After following the hints from [this answer][1], I could sync via owncloud for hours without any file system error.
This is no big surprise, since only very few I/O operations now happen on the eMMC. Here is what I did:

 - attach an external traditional hard disk
 - put postgres and /var/log on the external disk
 - disable swap
 - use ramfs for /tmp
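
Whether these changes really took effect can be checked from `/proc/mounts` and `/proc/swaps`. The sketch below is only an illustration: the helper names are invented, `/var/lib/postgresql` is assumed to be the PostgreSQL data location (the Ubuntu default, not stated in this thread), and the sample text stands in for the real files.

```python
def parse_mounts(text):
    """Parse /proc/mounts text into (device, mount_point, fs_type) tuples."""
    entries = []
    for row in text.splitlines():
        fields = row.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


def mount_for_path(entries, path):
    """Return the entry whose mount point is the longest prefix of path."""
    best = None
    for entry in entries:
        mount_point = entry[1]
        inside = path == mount_point or path.startswith(mount_point.rstrip("/") + "/")
        # ">=" lets a later mount on the same mount point shadow an earlier one.
        if inside and (best is None or len(mount_point) >= len(best[1])):
            best = entry
    return best


def swap_devices(text):
    """Return the names listed in /proc/swaps (first line is a header)."""
    rows = text.splitlines()[1:]
    return [row.split()[0] for row in rows if row.strip()]


def check_layout(mounts_text, swaps_text, emmc="/dev/mmcblk0",
                 write_heavy=("/var/lib/postgresql", "/var/log")):
    """Report whether swap is off, /tmp is in RAM, and write-heavy paths avoid the eMMC."""
    entries = parse_mounts(mounts_text)
    tmp = mount_for_path(entries, "/tmp")
    report = {
        "swap_disabled": not swap_devices(swaps_text),
        "tmp_in_ram": tmp is not None and tmp[2] in ("tmpfs", "ramfs"),
    }
    for path in write_heavy:
        mount = mount_for_path(entries, path)
        report[f"{path} off eMMC"] = mount is not None and not mount[0].startswith(emmc)
    return report


# Hypothetical sample data; /dev/sda1 stands for the external disk.
MOUNTS = """\
/dev/mmcblk0p2 / ext4 rw,relatime 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
/dev/sda1 /var/lib/postgresql ext4 rw,relatime 0 0
/dev/sda1 /var/log ext4 rw,relatime 0 0
"""
SWAPS = "Filename\t\t\t\tType\t\tSize\tUsed\tPriority\n"

if __name__ == "__main__":
    for check, ok in check_layout(MOUNTS, SWAPS).items():
        print(f"{check}: {'ok' if ok else 'NOT ok'}")
```

On the real machine, the two arguments would be the contents of `/proc/mounts` and `/proc/swaps`.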


But the question above still remains:

 Is running linux with postgres on eMMC a bad idea in general?



  [1]: http://raspberrypi.stackexchange.com/questions/169/how-can-i-extend-the-life-of-my-sd-card/186#186


BTW, I asked the same question here: http://askubuntu.com/questions/880947/linux-on-emmc-corrupt-file-system





Re: [GENERAL] PostgreSQL on eMMC - Corrupt file system

From
Christoph Moench-Tegeder
Date:
## Thomas Güttler (guettliml@thomas-guettler.de):

>  Is running linux with postgres on eMMC a bad idea in general?

I'd say that running anything with a read-write load on eMMC will
end in pieces. It's ok to occasionally write something, but a mixed
load is not really what these things were designed for. The wear
leveling can be quite basic, you never know when it's gonna happen
(i.e. sudden power down can kill your filesystem - that's why disabling
journaling is not a very great idea), and if your device is "mostly
full" anyways, the wear leveling has not much space to redirect the
writes to. Remember that some of those chips are sold mostly by
price - that is, the hobbyist "embedded" devices get the cheapest
chips. A safer bet would be adding an external storage; some
64GB SATA SSDs are available for less than 50€ (perhaps it's better
not to go for the cheapest ones here, too).
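
For judging how worn a chip already is, eMMC 5.0 and later devices report a coarse lifetime estimate, and as far as I know kernels newer than the 4.4 in the trace above expose it as `/sys/block/mmcblk0/device/life_time` (two hex values, e.g. `0x01 0x01`). The sketch below only decodes such a string; the sysfs path, the availability on a given board, and the function names are assumptions.

```python
def parse_life_time(text):
    """Parse sysfs 'life_time' content such as '0x01 0x02' into a tuple of ints."""
    return tuple(int(field, 16) for field in text.split())


def describe(value):
    """Translate one lifetime estimate value into words.

    Values 0x01..0x0A stand for 10% steps of the estimated lifetime used,
    0x0B means the estimate has been exceeded, anything else is undefined.
    """
    if 0x01 <= value <= 0x0A:
        return f"{(value - 1) * 10}-{value * 10}% of estimated lifetime used"
    if value == 0x0B:
        return "estimated lifetime exceeded"
    return "not defined"


if __name__ == "__main__":
    # Hypothetical reading; on real hardware the text would come from the
    # life_time file, if the kernel and the chip provide it.
    for estimate in parse_life_time("0x01 0x02\n"):
        print(describe(estimate))
```

The estimate is vendor-computed and coarse, so it can only hint at wear; it says nothing about controller timeouts like the ones in the trace.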

Regards,
Christoph

--
Spare Space


Re: [GENERAL] PostgreSQL on eMMC - Corrupt file system

From
Mark Morgan Lloyd
Date:
On 09/02/17 23:00, Christoph Moench-Tegeder wrote:
> ## Thomas Güttler (guettliml@thomas-guettler.de):
>
>>  Is running linux with postgres on eMMC a bad idea in general?
>
> I'd say that running anything with a read-write load on eMMC will
> end in pieces. It's ok to occasionally write something, but a mixed
> load is not really what these things were designed for. The wear
> leveling can be quite basic, you never know when it's gonna happen
> (i.e. sudden power down can kill your filesystem - that's why disabling
> journaling is not a very great idea), and if your device is "mostly
> full" anyways, the wear leveling has not much space to redirect the
> writes to. Remember that some of those chips are sold mostly by
> price - that is, the hobbyist "embedded" devices get the cheapest
> chips. A safer bet would be adding an external storage; some
> 64GB SATA SSDs are available for less than 50€ (perhaps it's better
> not to go for the cheapest ones here, too).

I agree, but three additional comments. First, we've got a fair number
of RPis running their root filesystems on the internal SD-Card without
problems, but the one Odroid which runs an eMMC card failed a few weeks
ago. Second, a useful precaution is to put stuff which will be updated
on an external device, although the same longevity concerns apply if
it's Flash-based. Third, experience here suggests that reliability
/might/ be improved if you fully zero a device before partitioning it to
make absolutely sure that the internal controller has touched every block.
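
As a rough sketch of the zeroing step (most people would simply use `dd` with `/dev/zero`), the function below overwrites a whole file or block device with zeros. The name `zero_fill` is made up, the demo runs against a scratch file only, and pointing it at a real device destroys everything on it.

```python
import os
import tempfile


def zero_fill(path, chunk_size=1024 * 1024):
    """Overwrite every byte of an existing file or block device with zeros.

    DESTRUCTIVE: all data at `path` is lost. Returns the number of bytes written.
    """
    zeros = bytes(chunk_size)
    with open(path, "r+b", buffering=0) as target:
        # Seeking to the end yields the size for regular files and,
        # on Linux, for block devices as well.
        size = target.seek(0, os.SEEK_END)
        target.seek(0)
        written = 0
        while written < size:
            step = min(chunk_size, size - written)
            # Unbuffered writes may be partial, so count what was really written.
            written += target.write(zeros[:step])
        os.fsync(target.fileno())  # push the zeros down to the device
    return written


if __name__ == "__main__":
    # Demo on a scratch file, NOT on a real device.
    with tempfile.NamedTemporaryFile(delete=False) as scratch:
        scratch.write(b"\xff" * 3_000_000)
    print(zero_fill(scratch.name), "bytes zeroed")
    os.remove(scratch.name)
```

Run against something like `/dev/mmcblk0` it would need root and an unmounted device; whether this actually improves reliability is, as said above, only a /might/.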

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]


Re: [GENERAL] PostgreSQL on eMMC - Corrupt file system

From
Thomas Güttler
Date:
On 10.02.2017 at 09:16, Mark Morgan Lloyd wrote:
> On 09/02/17 23:00, Christoph Moench-Tegeder wrote:
>> ## Thomas Güttler (guettliml@thomas-guettler.de):
>>
>>>  Is running linux with postgres on eMMC a bad idea in general?
>>
>> I'd say that running anything with a read-write load on eMMC will
>> end in pieces. It's ok to occasionally write something, but a mixed
>> load is not really what these things were designed for. The wear
>> leveling can be quite basic, you never know when it's gonna happen
>> (i.e. sudden power down can kill your filesystem - that's why disabling
>> journaling is not a very great idea), and if your device is "mostly
>> full" anyways, the wear leveling has not much space to redirect the
>> writes to. Remember that some of those chips are sold mostly by
>> price - that is, the hobbyist "embedded" devices get the cheapest
>> chips. A safer bet would be adding an external storage; some
>> 64GB SATA SSDs are available for less than 50€ (perhaps it's better
>> not to go for the cheapest ones here, too).
>
> I agree, but three additional comments. First, we've got a fair number of RPis running their root filesystems on the
> internal SD-Card without problems, but the one Odroid which runs an eMMC card failed a few weeks ago. Second, a useful
> precaution is to put stuff which will be updated on an external device, although the same longevity concerns apply if
> it's Flash-based. Third, experience here suggests that reliability /might/ be improved if you fully zero a device before
> partitioning it to make absolutely sure that the internal controller has touched every block.


Fully zeroing the device sounds reasonable. Thank you for sharing your knowledge.

BTW, I moved postgres and /var/log to an external disk. I removed swap from the eMMC and use tmpfs for /tmp.

Since these changes, I have had no more failures.

In this case it is just a small server for my personal environment.

But still I have a bad feeling ...


Regards,

  Thomas Güttler

