database - What are the minimum hardware requirements & setup for a PostgreSQL cluster with a Rails frontend with a lot of small writes and long reads?


Background:

I'm trying to figure out the minimum requirements for an app I'm building. I'm fluent in MySQL and PostgreSQL as a developer, but I'm not a DBA, hence the question. I'm building a mobile app that will talk to a remote API, and I need to figure out the requirements for that API. At this point I'm doing this as a hobby project and the mobile app is going to be free, so I don't have a big budget, and I need to figure out the requirements as closely as possible.

Application requirements:

The remote API will be done in Rails, providing web & JSON interfaces, and it stores data in a PostgreSQL cluster. The mobile app will send a lot of short writes: ~1 every minute * 20,000 app installations. Most of the reads are going to be report style, longer reads, and they don't happen often: maybe once or twice a day per user. The DB needs to be optimized for writes. Read actions can be redirected to a slave cluster / server; at this point I don't need them to be real time. A 1 day delay is fine.

More details per questions in the comments:

1) The writes are small: I'll be sending some kind of auth token (API key) and a little data. We're talking less than 1 KB of data: a timestamp and GPS coordinates, maybe something else eventually, but I doubt it. I don't envision large data like pictures or anything like that. It's going to be similar to a running / jogging / biking tracking app.

2) Scaling up? Hmm. 200,000 - 400,000 apps max if it takes off within the first 2 years.

3) Data is critical. The whole point is to be able to run accurate reports once the data is collected. There are 2 options to mitigate the issue:

  • I can estimate based on Google Maps data and the last known points (right before data was lost, and right after the connection was reestablished).
  • Data is first saved on the phone in SQLite storage and once a day (or at app start up) it's synced to the server / verified. Once verification / synchronization is successful, data on the phone can be rotated (anything older than 1 month can be wiped off the phone).
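The second option can be sketched roughly as follows. This is only an illustration under assumptions: Python's `sqlite3` stands in for the phone's SQLite storage, the `points` table, its columns, and the `upload` callback are hypothetical names, and timestamps are stored as Unix seconds to keep the cutoff comparison simple.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def open_store(path=":memory:"):
    """Open the local store and create the (hypothetical) points table."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS points (
               id          INTEGER PRIMARY KEY,
               recorded_at INTEGER NOT NULL,  -- Unix seconds, UTC
               lat         REAL NOT NULL,
               lon         REAL NOT NULL,
               synced      INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return db


def record(db, recorded_at, lat, lon):
    """Save one GPS point locally; it starts out unsynced."""
    db.execute(
        "INSERT INTO points (recorded_at, lat, lon) VALUES (?, ?, ?)",
        (int(recorded_at.timestamp()), lat, lon),
    )
    db.commit()


def sync(db, upload):
    """Send all unsynced rows; mark them synced only if upload() confirms."""
    rows = db.execute(
        "SELECT id, recorded_at, lat, lon FROM points "
        "WHERE synced = 0 ORDER BY id"
    ).fetchall()
    if rows and upload(rows):
        db.executemany(
            "UPDATE points SET synced = 1 WHERE id = ?",
            [(row[0],) for row in rows],
        )
        db.commit()
        return len(rows)
    return 0


def rotate(db, now, keep=timedelta(days=30)):
    """Delete rows that are both verified as synced and older than `keep`."""
    cutoff = int((now - keep).timestamp())
    cur = db.execute(
        "DELETE FROM points WHERE synced = 1 AND recorded_at < ?", (cutoff,)
    )
    db.commit()
    return cur.rowcount
```

The point of the `synced` flag is that rotation never touches a row the server has not confirmed, so a failed or rejected upload just leaves the data on the phone for the next attempt.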

Actual question details

So the question is for those who have dealt with apps on this scale: what would be the initial PostgreSQL setup, both cluster configuration and hardware (cloud)-wise, and how easy / difficult is it to scale?


To prevent irrelevant suggestions and answers:

NoSQL alternatives

I considered NoSQL alternatives like CouchDB, MongoDB, etc. Riak came out the winner, considering that it's easy to manage for a one-man team and you need only 3 DB servers to have a working replicating cluster. After mapping out the app, I figured NoSQL is not a fit for this app; it belongs in the realm of RDBMSs.

NoSQL alternatives & SQL options

Considering my non-existent budget, I didn't consider SQL Server, Oracle, and such. MySQL is the other real alternative, but I need hstore, and replication is easier to implement right in PostgreSQL IMHO.

This is good news:

Data is first saved on the phone in SQLite storage...

So - we're not having to cope with bursts of small writes and can batch updates together. What's more, we can reject them and the app can try again later. Good, so we can rent monthly rather than hourly (cheaper!).
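A minimal sketch of what batching and rejecting could look like on the API side. Everything named here is an assumption for illustration: the `track_points` table and its columns are hypothetical, the `%s` placeholders follow the convention used by the psycopg driver, and the `max_in_flight` threshold is an arbitrary number you would tune from your own tests. The real API would be Rails; Python is used only to show the shape of the idea.

```python
from typing import List, Sequence, Tuple

# One GPS sample: (device_id, ISO 8601 timestamp, latitude, longitude)
Point = Tuple[int, str, float, float]


def build_batch_insert(points: Sequence[Point]) -> Tuple[str, List[object]]:
    """Turn a whole uploaded batch into ONE multi-row INSERT statement.

    Returns the SQL text and a flat parameter list, ready to hand to a
    driver's execute(sql, params) call.
    """
    if not points:
        raise ValueError("empty batch")
    row = "(%s, %s, %s, %s)"
    sql = (
        "INSERT INTO track_points (device_id, recorded_at, lat, lon) VALUES "
        + ", ".join([row] * len(points))
    )
    params = [value for point in points for value in point]
    return sql, params


def accept_batch(in_flight: int, max_in_flight: int = 8) -> bool:
    """Decide whether to take a batch now or tell the app to retry later."""
    return in_flight < max_in_flight
```

Because the phone keeps its own copy until the sync is verified, a rejected batch costs nothing but a delay, which is what lets the server be sized for sustained load rather than peak load.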

That means our limit is purely down to maximum sustainable disk I/O. Now, the mention of "the cloud" complicates things. Cheap disk I/O is typically poor for (any type of) database, and the good stuff is expensive.

Some back-of-envelope calculations...

20,000 apps @ 1 KB / min ~ 20 MB/min ~ 333 KB/sec
200,000 apps @ 1 KB / min ~ 200 MB/min ~ 3.3 MB/sec
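The same arithmetic as a small script, in case you want to plug in other numbers. It assumes 1 KB per write, one write per app per minute, and decimal units (1 MB = 1000 KB); the figures are raw payload only and ignore WAL, index, and row overhead.

```python
def write_rate(apps, kb_per_write=1.0, writes_per_min=1.0):
    """Raw payload write rate for a given number of app installations."""
    kb_per_min = apps * kb_per_write * writes_per_min
    return {
        "mb_per_min": kb_per_min / 1000,
        "kb_per_sec": kb_per_min / 60,
        "gb_per_day": kb_per_min * 60 * 24 / 1_000_000,
    }


for apps in (20_000, 200_000, 400_000):
    rate = write_rate(apps)
    print(
        f"{apps:>7} apps: {rate['mb_per_min']:.0f} MB/min, "
        f"{rate['kb_per_sec']:.0f} KB/sec, {rate['gb_per_day']:.1f} GB/day"
    )
```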

Now you'll be writing to the WAL (transaction log) first and then the tables, and you need to allow for reporting, but that's not much at all. If your disk requirements increase, though, you might be better off with a couple of managed, real machines with their own disks.

So - script your PostgreSQL server setup. I find Ansible easy to get going with. Add some test scripts to simulate different numbers of requests and batch sizes. You should be able to spin up a VM, run a batch of tests, and get real figures in a couple of hours per provider.
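A rough skeleton for such a test script might look like this. It is a sketch under assumptions: `write_batch` is a hypothetical callable that you would replace with a function doing the actual INSERT against the server under test, the row layout is made up, and the loop is single-threaded, so it does not model concurrent clients.

```python
import random
import time
from typing import Callable, Dict, List, Tuple

# One synthetic sample: (device_id, unix timestamp, latitude, longitude)
Row = Tuple[int, float, float, float]


def make_batch(device_id: int, size: int, rng: random.Random) -> List[Row]:
    """Generate `size` fake one-per-minute GPS samples for one device."""
    return [
        (device_id, 60.0 * i, rng.uniform(-90, 90), rng.uniform(-180, 180))
        for i in range(size)
    ]


def run_trial(
    write_batch: Callable[[List[Row]], object],
    requests: int,
    batch_size: int,
    seed: int = 0,
) -> Dict[str, float]:
    """Send `requests` batches of `batch_size` rows and time the whole run."""
    rng = random.Random(seed)
    started = time.perf_counter()
    for device_id in range(requests):
        write_batch(make_batch(device_id, batch_size, rng))
    seconds = time.perf_counter() - started
    rows = requests * batch_size
    return {
        "requests": requests,
        "batch_size": batch_size,
        "rows": rows,
        "seconds": seconds,
        "rows_per_sec": rows / seconds if seconds > 0 else float("inf"),
    }


# Stand-in writer that just collects rows in memory; swap in a real one.
sink: List[Row] = []
for requests, batch_size in [(1000, 1), (100, 60), (10, 1440)]:
    print(run_trial(sink.extend, requests, batch_size))
```

Running the same set of trials on each provider's VM gives comparable rows-per-second figures for per-minute writes, hourly batches, and once-a-day syncs.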

