database - What are the minimum hardware requirements & setup for a PostgreSQL cluster with a Rails frontend with a lot of small writes and long reads?
Background:
I'm trying to figure out the minimum requirements for an app I'm building. I'm fluent in MySQL and PostgreSQL as a developer, but I'm not a DBA, hence the question. I'm building a mobile app that will talk to a remote API, and I need to figure out the requirements for that API. At this point I'm doing it as a hobby project and the mobile app is going to be free, so I don't have a big budget, and I need to figure out the requirements as closely as possible.
Application requirements:
The remote API will be done in Rails, providing web & JSON interfaces, and it stores its data in a PostgreSQL cluster. The mobile app will send a lot of short writes: about 1 every minute * 20,000 app installations. Most of the reads are going to be report style: longer reads that don't happen often, maybe once or twice a day per user. The DB needs to be optimized for writes. Read actions can be redirected to a slave cluster / server; at this point they don't need to be real time. A 1 day delay is fine.
More details, per the questions in the comments:
1) The writes are small: I'll be sending some kind of auth token (API key) and a little data. We're talking about less than 1 KB of data: a timestamp and GPS coordinates, maybe something else eventually, but I doubt it. I don't envision large data like pictures or anything like that. It's going to be similar to a running / jogging / biking tracking app.
2) Scaling up? Hmm. 200,000 - 400,000 apps max if it takes off within the first 2 years.
3) The data is critical. The whole point is to be able to run accurate reports once the data is collected. There are 2 options to mitigate the issue of lost data:
- I can estimate based on Google Maps data and the last known points (right before the data was lost, and right after the connection was reestablished).
- The data is first saved on the phone in SQLite storage, and once a day (or at app start-up) it's synced to the server / verified. Once verification / synchronization is successful, the data on the phone can be rotated (anything older than 1 month can be wiped off the phone).
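To make the second option concrete, here is a rough sketch of the save / sync / rotate cycle. It's in Python with the standard `sqlite3` module purely for illustration (the real thing would live in the mobile app), and the `points` table, its columns, the `upload` callback and the 30-day window standing in for "1 month" are all assumptions of mine, not a finished design.

```python
import sqlite3

# Hypothetical local schema: one row per GPS fix, plus a "synced" flag.
# recorded_at is stored as Unix seconds.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS points ("
    " id INTEGER PRIMARY KEY,"
    " recorded_at INTEGER NOT NULL,"
    " lat REAL NOT NULL,"
    " lon REAL NOT NULL,"
    " synced INTEGER NOT NULL DEFAULT 0)"
)


def open_store(path=":memory:"):
    """Open (or create) the on-phone store."""
    db = sqlite3.connect(path)
    db.execute(SCHEMA)
    return db


def record(db, recorded_at, lat, lon):
    """Save one point locally; no network is involved here."""
    db.execute(
        "INSERT INTO points (recorded_at, lat, lon) VALUES (?, ?, ?)",
        (recorded_at, lat, lon),
    )
    db.commit()


def sync(db, upload):
    """Send all unsynced rows via upload(rows).

    Rows are only marked as synced when upload() returns True, so a failed
    or rejected upload is simply retried on the next call.
    """
    rows = db.execute(
        "SELECT id, recorded_at, lat, lon FROM points"
        " WHERE synced = 0 ORDER BY id"
    ).fetchall()
    if rows and upload(rows):
        db.executemany(
            "UPDATE points SET synced = 1 WHERE id = ?",
            [(row[0],) for row in rows],
        )
        db.commit()
        return len(rows)
    return 0


def rotate(db, now, keep_seconds=30 * 24 * 3600):
    """Delete rows that are both synced and older than the keep window."""
    cur = db.execute(
        "DELETE FROM points WHERE synced = 1 AND recorded_at < ?",
        (now - keep_seconds,),
    )
    db.commit()
    return cur.rowcount
```

The point of the `synced` flag is that nothing is wiped off the phone until the server has confirmed it, so a failed sync costs nothing but a retry.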
Actual question details
So the question is: for those who have dealt with apps on this scale, what should the initial PostgreSQL setup be, both cluster configuration and hardware (cloud)-wise, and how easy / difficult is it to scale?
To prevent irrelevant suggestions and answers:
NoSQL alternatives
I considered NoSQL alternatives such as CouchDB, MongoDB, etc. Riak came out the winner, considering that it's easy to manage for a one-man team and I'd need 3 DB servers to have a working replicating cluster. However, after mapping out the app, I figured NoSQL is not a fit for this app; it belongs in the realm of RDBMSs.
NoSQL alternatives & SQL options
Considering the non-existent budget, I didn't consider SQL Server, Oracle, and such. MySQL is the other real alternative, but I need hstore, and replication is easier to implement right in PostgreSQL, IMHO.
This is good news:
The data is first saved on the phone in SQLite storage...
So we're not having to cope with bursts of small writes, and we can batch updates together. What's more, you can reject them and the app can try again later. That's good: you can rent monthly rather than hourly (cheaper!).
That means our limit is purely down to the maximum sustainable disk I/O. Now, your mention of "the cloud" complicates things. Cheap disk I/O is typically poor for (any type of) database, and the good stuff is expensive.
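As a rough illustration of the batching idea, here is a minimal Python sketch that groups queued points into fixed-size batches and builds one multi-row INSERT per batch. The table name `track_points` and its columns are made up for the example, and the `%s` placeholders follow the parameter style psycopg uses; nothing here talks to a database, it only builds the statement and its parameter list.

```python
def chunked(rows, batch_size):
    """Yield consecutive slices of rows, each at most batch_size long."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def build_insert(batch, table="track_points"):
    """Build one multi-row INSERT for a batch of points.

    The table and column names are hypothetical. Each row is expected to be
    a (device_id, recorded_at, lat, lon) tuple. Returns the SQL text and a
    flat parameter list to hand to the database driver.
    """
    columns = ("device_id", "recorded_at", "lat", "lon")
    row_sql = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table,
        ", ".join(columns),
        ", ".join([row_sql] * len(batch)),
    )
    params = [value for row in batch for value in row]
    return sql, params
```

One statement per batch means one round trip and one commit for many rows, instead of one per row, and that is where most of the saving should come from. The right batch size is something to measure rather than guess.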
Some back-of-envelope calculations...
20,000 apps @ 1 KB / min ~ 20 MB/min ~ 333 KB/sec
200,000 apps @ 1 KB / min ~ 200 MB/min ~ 3.3 MB/sec
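If you want to play with those numbers, the arithmetic fits in a few lines of Python. This assumes decimal units (1 MB = 1000 KB) and the question's worst case of 1 KB per write, one write per minute per installation; the function and parameter names are just mine.

```python
def write_rate(installs, kb_per_write=1.0, writes_per_min=1.0):
    """Raw incoming data rate for a given number of installations.

    Assumes every installation writes kb_per_write KB, writes_per_min times
    a minute, and that 1 MB = 1000 KB.
    """
    kb_per_min = installs * kb_per_write * writes_per_min
    return {
        "rows_per_sec": installs * writes_per_min / 60,
        "mb_per_min": kb_per_min / 1000,
        "kb_per_sec": kb_per_min / 60,
    }


if __name__ == "__main__":
    # 20,000 today; 200,000 - 400,000 is the range given in the question.
    for installs in (20_000, 200_000, 400_000):
        rate = write_rate(installs)
        print(
            "{:>7} apps: {:6.0f} rows/sec, {:5.0f} MB/min, {:6.0f} KB/sec".format(
                installs,
                rate["rows_per_sec"],
                rate["mb_per_min"],
                rate["kb_per_sec"],
            )
        )
```

This is only the raw payload coming in. What actually hits the disk will be more than that.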
Now, you'll be writing to the WAL (transaction log) first and then to the tables, and you need to allow for reporting as well, so that's not all of it. If your disk requirements increase, you might be better off with a couple of managed, real machines with their own disks.
So, script your PostgreSQL server setup. I find Ansible easy to get going with. Add some test scripts that simulate different numbers of requests and batch sizes. You should be able to spin up a VM, run a batch of tests, and get real figures in a couple of hours per provider.
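For the test scripts, something along these lines could be a starting point: a small Python harness that generates synthetic points and times an `insert_batch` callable over a grid of row counts and batch sizes. The callable is the part you'd write yourself (for example, a function that runs a multi-row INSERT against the server under test); the names and the shape of the fake data are assumptions for the sketch.

```python
import random
import time


def fake_points(n, seed=0):
    """Generate n synthetic (device_id, recorded_at, lat, lon) tuples."""
    rng = random.Random(seed)
    now = int(time.time())
    return [
        (
            rng.randrange(1, 20001),          # pretend device id
            now - rng.randrange(0, 86400),    # some time in the last day
            rng.uniform(-90.0, 90.0),         # latitude
            rng.uniform(-180.0, 180.0),       # longitude
        )
        for _ in range(n)
    ]


def run_case(insert_batch, total_rows, batch_size):
    """Time inserting total_rows rows in batches of batch_size."""
    points = fake_points(total_rows)
    start = time.perf_counter()
    for i in range(0, total_rows, batch_size):
        insert_batch(points[i:i + batch_size])
    elapsed = time.perf_counter() - start
    return {
        "rows": total_rows,
        "batch_size": batch_size,
        "seconds": elapsed,
        "rows_per_sec": total_rows / elapsed if elapsed > 0 else float("inf"),
    }


def run_grid(insert_batch, row_counts, batch_sizes):
    """Run every combination of row count and batch size, in order."""
    return [
        run_case(insert_batch, n, b)
        for n in row_counts
        for b in batch_sizes
    ]
```

Something like `run_grid(my_insert, [20_000, 200_000], [1, 100, 1000])` then gives a small table of timings to compare per provider, where `my_insert` is your own function. A single-threaded loop like this only measures sequential batch inserts, so treat the figures as a lower bound on what concurrent clients would show.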